Author Topic: Air Traffic Control failure.  (Read 3621 times)

rogerzilla

  • When n+1 gets out of hand
Re: Air Traffic Control failure.
« Reply #25 on: 31 August, 2023, 07:18:03 am »
My advice: always book your own accommodation the moment things go pear-shaped, rather than wait for the airline to recommend some.
Ian is Del Griffith and ICMFP.
Hard work sometimes pays off in the end, but laziness ALWAYS pays off NOW.

Re: Air Traffic Control failure.
« Reply #26 on: 31 August, 2023, 08:08:54 am »
Quote from: Regulator
I see some of the Twitter/X/GBNews frothers are blaming the French for the ATC failure...

...despite the same information being received by a number of ATC systems and processed without a major failure.  Which would suggest that it's not the data that's the issue.
Aye.  I saw the "Crashed by dodgy French data" headlines and my first thought was, "Poor or non-existent validation of incoming data coupled with poor or non-existent exception handling when faced with utter crap."  The second was, "What a piss-poor excuse for failure *that* was."

Anyone else remember the steady stream of not at all flattering stories that appeared in CW when NATS was being commissioned and the ATC software was being written / tested *mumble* decades ago?
If, as NATS are claiming, it was a ‘corrupted’ flight plan that caused a long-term national outage, then I’d be questioning the professionalism of each and every team leader from the requirements capture team, through the designers, the coders, the testers and the operational acceptance team. In this day and age there is no acceptable reason that a single poor data set could crash a system. Buffer overflow errors are high school level stuff.

From what I’ve read/heard it was the input database that crashed, and they had a 4-hour buffer of live flight data. When they had failed to get the input database back online within that 4-hour window (that’s never going to happen, Matt, 4 hours is ages, plenty to bring the database back online) they had to revert to the manual system, which thankfully enough of the controllers could use.  However, that’s when the backlogs started.

But someone will have got their bonus on the back of saving all that money on implementing a proper backup system, and they have surely been promoted or retired by now.

So a once in a decade crash happens in a system that’s been going for decades allegedly due to a predictable problem that could occur very frequently and easily.

How strange. How very strange.



Move Faster and Bake Things

Re: Air Traffic Control failure.
« Reply #27 on: 31 August, 2023, 09:05:37 am »
Mr Smith was involved in some way in setting this system up- he makes software that tests software. I will ask.

Our QA Manager was in Magaluf at the weekend for a stag do. He had to be back at work on Tuesday as he's the Q person, other senior members of the team also being out of the country.
To get home he took a ferry to Barcelona then trains to Paris and finally a flight from Paris to NCL. His wife, home with their 6-week-old, was thrilled.

Re: Air Traffic Control failure.
« Reply #28 on: 31 August, 2023, 10:28:28 am »
The last stag do I went to with a manager was in Dovercourt. We still had trouble getting home.
Move Faster and Bake Things

Re: Air Traffic Control failure.
« Reply #29 on: 31 August, 2023, 10:31:43 am »
So a once in a decade crash happens in a system that’s been going for decades allegedly due to a predictable problem that could occur very frequently and easily.

How strange. How very strange.

That isn't strange at all.  Software failures don't happen randomly like a dice roll. 

Some external conditions will likely have changed - the French perhaps did make a change which resulted in slightly different input data.  The NATS system, which never validated its input well enough, managed for decades while certain conditions were true.  Those conditions changed and it had a problem.

This is a really really common way for software bugs to 'appear', and it does not mean the original software was correct until the nasty input broke it.  Of course, it doesn't mean the change was handled correctly either. 

Re: Air Traffic Control failure.
« Reply #30 on: 31 August, 2023, 01:22:25 pm »
You mean like you could roll the die and get 3.7654?
Move Faster and Bake Things

Re: Air Traffic Control failure.
« Reply #31 on: 31 August, 2023, 02:59:38 pm »
Years ago I was a programmer working for a blue chip company. An area of which I had knowledge was the workflow element that was key to every activity. If this was buggered up, everything was buggered up. Later I was moved to change management. This meant I had to sign off every change and ensure it had been put through every stage of testing with sign-off from all relevant users. It wasn’t a popular role with some people and I used to get managers well senior to me telling me to sign stuff off with key stages skipped or skimped. I always refused until they had referred the request to a sufficient level, usually a director, to get me off the hook if it went pear-shaped. Once the team I was in got established, system downtime plummeted.

Later the IT work was contracted out to an Indian company and I was offered redundancy which I gladly accepted. Later still I heard that the Indians were not giving our work the expected level of priority and that a lot of the changes they made caused down-time. The Technical Director was obliged to leave.

Any time accountants try to economise on IT, expect problems.
Move Faster and Bake Things

Re: Air Traffic Control failure.
« Reply #32 on: 31 August, 2023, 03:33:01 pm »

Any time accountants get involved, expect problems.

Regulator

  • That's Councillor Regulator to you...
Re: Air Traffic Control failure.
« Reply #33 on: 31 August, 2023, 04:32:04 pm »

Any time accountants get involved, expect problems.

Particularly if the accountants are trying to flog you the system (e.g. Test & Trace)...
Quote from: clarion
I completely agree with Reg.

Green Party Councillor

Beardy

  • Shedist
Re: Air Traffic Control failure.
« Reply #34 on: 31 August, 2023, 07:05:09 pm »
That’s a bit harsh on accountants; it’s ‘Management’ accountants that are the issue. As long as accountants stick to their professional area of expertise, no problems should be experienced. It’s when accountants’ primary role is defined as reducing spend that the problems occur.

The same occurs when Personnel are charged with head count reduction rather than their traditional role of managing personnel issues.
For every complex problem in the world, there is a simple and easily understood solution that’s wrong.

Re: Air Traffic Control failure.
« Reply #35 on: 01 September, 2023, 06:43:39 am »
This is brazenly reposted from the PPRuNe thread. The characters NNNN signal End of Message to the NATS flight system.
Aircraft names with NNN are known to trip the system. (To be clear, the poster does not say this is the cause this time.)

https://www.pprune.org/rumours-news/654461-u-k-nats-systems-failure-9.html#post11494887
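
For anyone wondering how an in-band end marker can bite, here is a minimal Python sketch. It is not the NATS code or the real message format: the sample messages, the registration containing NNNN, and both function names are made up for illustration. It only shows why splitting on the marker wherever it appears is fragile, and how insisting that the marker sits alone on a line avoids that particular trap.

```python
END_MARKER = "NNNN"

def split_naive(stream: str) -> list[str]:
    """Split on every occurrence of the marker, wherever it appears."""
    return [part.strip() for part in stream.split(END_MARKER) if part.strip()]

def split_on_marker_line(stream: str) -> list[str]:
    """Treat the marker as end-of-message only when it is alone on a line."""
    messages, current = [], []
    for line in stream.splitlines():
        if line.strip() == END_MARKER:
            if current:
                messages.append("\n".join(current))
            current = []
        else:
            current.append(line)
    if current:
        # Unterminated trailing text is kept rather than silently dropped.
        messages.append("\n".join(current))
    return messages

# Hypothetical stream: the first message happens to contain the marker text.
stream = "FPL-GNNNN-IS\nNNNN\nFPL-GABCD-IS\nNNNN\n"

naive = split_naive(stream)            # first message is cut in two
strict = split_on_marker_line(stream)  # two intact messages
```

With the made-up stream above, the naive splitter returns three fragments and the line-based one returns the two original messages.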

Re: Air Traffic Control failure.
« Reply #36 on: 01 September, 2023, 07:08:25 am »
When you know you really are far from home:


Families abandoned for 11 days abroad as airlines ignore consumer rights after flight chaos


Quote

A family facing 11 days stranded in Turkey are among thousands of customers being let down by airlines as the fallout from the UK’s air traffic control IT meltdown continues.

At least 64 flights were cancelled at UK airports on Wednesday, according to aviation analytics company Cirium, after 345 were grounded on Tuesday and 1,585 on Monday.

Meanwhile, tens of thousands of Britons are stuck abroad having been unable to return home from holidays during one of the busiest travel weeks of the year.

Under UK law, airlines have a duty of care to customers which means they should get them to their destination at the earliest opportunity, even if this involves booking them onto a flight with another carrier.

Tour operators also have a duty of care to customers under the Government’s Package Travel Regulations and are obligated to provide “appropriate assistance without delay” including practical help “and finding alternative travel arrangements”.

Writing for i, Rory Boland, editor of Which? Travel, said the current rules are being “routinely ignored” by airlines and called on the regulator, the Civil Aviation Authority, to do more.

In 20 years, no airline in the UK has been fined for breaking consumer law.
Move Faster and Bake Things

ian

Re: Air Traffic Control failure.
« Reply #37 on: 04 September, 2023, 06:22:29 pm »
That’s why, for budget airlines, I simply book myself a decent hotel and rearrange my own trip. For BA et al. I’ll risk letting them have first shout. To be fair, what airline is going to have the staff and resources available for something like this? People know why flights are cheap, right? If you want a safe, risk-free option, go camping in your back garden. The weather might be rubbish but you’ll get home.

Re: Air Traffic Control failure.
« Reply #38 on: 04 September, 2023, 07:44:35 pm »
Never travel further than you can cycle home from.
Move Faster and Bake Things

Cudzoziemiec

  • Ride adventurously and stop for a brew.
Re: Air Traffic Control failure.
« Reply #39 on: 04 September, 2023, 07:49:05 pm »
Never travel further than you can cycle home from.
...in the number of days you have available.
Riding a concrete path through the nebulous and chaotic future.

Kim

  • Timelord
Re: Air Traffic Control failure.
« Reply #40 on: 06 September, 2023, 12:12:04 pm »
Preliminary report:

https://publicapps.caa.co.uk/docs/33/NERL%20Major%20Incident%20Investigation%20Preliminary%20Report.pdf


So not DNS, but duplicate waypoints with the same name (which is allowed, but evidently not programmed for) causing the system to panic (which is reasonable, given that it couldn't work out where a plane was supposed to be).  The backup system then took over and immediately died with the same error, leaving humans to pick up the pieces.
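
To make the duplicate-name problem concrete, here is a rough Python sketch. It is an illustration only: the waypoint table, the names and the "pick the candidate nearest the previous point" rule are all assumptions of mine, not anything taken from the report or from the real system. The point is that a lookup by name alone is ambiguous, and that an unresolvable plan can raise an error for that one plan rather than for the whole process.

```python
import math

# Hypothetical waypoint table: "DUPE" names two different places.
WAYPOINTS = {
    "ALPHA": [(50.0, -1.0)],
    "DUPE": [(51.0, 0.0), (45.0, -95.0)],
    "BRAVO": [(52.0, 1.0)],
}

def distance_km(a, b):
    """Great-circle (haversine) distance between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def resolve_route(names):
    """Turn waypoint names into positions, using the previous point to
    choose between places that share a name."""
    route = []
    for name in names:
        candidates = WAYPOINTS.get(name)
        if not candidates:
            raise ValueError(f"unknown waypoint {name!r}")
        if route:
            # Assumed heuristic: the intended point is the one nearest the last one.
            position = min(candidates, key=lambda c: distance_km(route[-1], c))
        elif len(candidates) == 1:
            position = candidates[0]
        else:
            # Nothing to disambiguate against, so reject this plan only.
            raise ValueError(f"ambiguous first waypoint {name!r}")
        route.append(position)
    return route

route = resolve_route(["ALPHA", "DUPE", "BRAVO"])
```

Here the nearer of the two "DUPE" entries is chosen because it follows "ALPHA"; a plan that starts at "DUPE" is refused with a ValueError the caller can handle.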

quixoticgeek

  • Mostly Harmless
Re: Air Traffic Control failure.
« Reply #41 on: 06 September, 2023, 12:15:14 pm »
Preliminary report:

https://publicapps.caa.co.uk/docs/33/NERL%20Major%20Incident%20Investigation%20Preliminary%20Report.pdf


So not DNS, but duplicate waypoints with the same name (which is allowed, but evidently not programmed for) causing the system to panic (which is reasonable, given that it couldn't work out where a plane was supposed to be).  The backup system then took over and immediately died with the same error, leaving humans to pick up the pieces.

Does this mean we can brick UK air travel by planning a flight from London to London?

J
--
Beer, bikes, and backpacking
http://b.42q.eu/

Kim

  • Timelord
Re: Air Traffic Control failure.
« Reply #42 on: 06 September, 2023, 12:26:49 pm »
Preliminary report:

https://publicapps.caa.co.uk/docs/33/NERL%20Major%20Incident%20Investigation%20Preliminary%20Report.pdf


So not DNS, but duplicate waypoints with the same name (which is allowed, but evidently not programmed for) causing the system to panic (which is reasonable, given that it couldn't work out where a plane was supposed to be).  The backup system then took over and immediately died with the same error, leaving humans to pick up the pieces.

Does this mean we can brick UK air travel by planning a flight from London to London?

Apparently so, as long as the start and end points are outside UK airspace.

Mr Larrington

  • A bit ov a lyv wyr by slof standirds
  • Custard Wallah
    • Mr Larrington's Automatic Diary
Re: Air Traffic Control failure.
« Reply #43 on: 06 September, 2023, 05:40:15 pm »
There are three Londons in USAnia and at least one in Canuckistan :demon:
External Transparent Wall Inspection Operative & Mayor of Mortagne-au-Perche
Satisfying the Bloodlust of the Masses in Peacetime

Mrs Pingu

  • Who ate all the pies? Me
Re: Air Traffic Control failure.
« Reply #44 on: 06 September, 2023, 07:37:37 pm »
Preliminary report:

https://publicapps.caa.co.uk/docs/33/NERL%20Major%20Incident%20Investigation%20Preliminary%20Report.pdf


So not DNS, but duplicate waypoints with the same name (which is allowed, but evidently not programmed for) causing the system to panic (which is reasonable, given that it couldn't work out where a plane was supposed to be).  The backup system then took over and immediately died with the same error, leaving humans to pick up the pieces.

Does this mean we can brick UK air travel by planning a flight from London to London?

Apparently so, as long as the start and end points are outside UK airspace.

Sounds like a Norwegian recipe for Kristiansand and Kristiansund fuckery.
Do not clench. It only makes it worse.

ian

Re: Air Traffic Control failure.
« Reply #45 on: 06 September, 2023, 08:20:32 pm »
I, for one, have been impressed by the speed with which blokes on the internet pivoted their comprehensive expertise in the last big thing to a deep understanding of ATC systems.

quixoticgeek

  • Mostly Harmless
Re: Air Traffic Control failure.
« Reply #46 on: 06 September, 2023, 08:40:43 pm »
I, for one, have been impressed by the speed with which blokes on the internet pivoted their comprehensive expertise in the last big thing to a deep understanding of ATC systems.

Sure thing. It's not like some of us have spent over 2 decades working with high availability systems...

But sure. Carry on

J
--
Beer, bikes, and backpacking
http://b.42q.eu/

Adam

  • It'll soon be summer
    • Charity ride Durness to Dover 18-25th June 2011
Re: Air Traffic Control failure.
« Reply #47 on: 06 September, 2023, 09:04:02 pm »
Seems bizarre that having identified it didn't seem to be a valid route, rather than simply rejecting it, the system effectively switched off.

30+ years ago when I was submitting flight plans at Blackbushe Airport, I filled them out on paper, passed them to Reg, who ran the tower (technically a FIS operator not an air traffic controller but he acted like he was), who read through them and then handed them back if I'd done them wrong. 

For the interim report to state
"Clearly a better way to handle this specific logic error would be for FPRSA-R to identify and remove the message and avoid a critical exception. However, since flight data is safety critical information that is passed to ATCOs the system must be sure it is correct and could not do so in this case. It therefore stopped operating, avoiding any opportunity for incorrect data being passed to a controller"

is an abysmal admission of the failure to properly set up the automation to cover this scenario.

What's also worrying is that they couldn't easily work out which flight plan had caused the issue, which is why they ran out of time for the 4-hour buffer of data and then had to switch to processing everything manually.  Surely their error logs showed what the system was doing at the time of the critical exception?
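
As a sketch of what "reject it and say which one it was" could look like, here is a small Python example. Everything in it is hypothetical (the plan ids, the `parse` stand-in, the logger name); it is not how FPRSA-R works. It only shows the general pattern: handle each plan separately, log the id of the one that fails, set it aside for a human, and keep processing the rest. Whether setting a plan aside is acceptable for safety-critical data is exactly the judgment the report is arguing about, so treat this as the shape of the idea rather than a fix.

```python
import logging

logger = logging.getLogger("flightplans")

def process_batch(plans, handler):
    """Run handler on each (plan_id, raw_text) pair. A failing plan is logged
    with its id and quarantined; the rest of the batch still goes through."""
    accepted, quarantined = [], []
    for plan_id, raw in plans:
        try:
            accepted.append((plan_id, handler(raw)))
        except Exception:
            # The log line names the offending plan, with the traceback attached.
            logger.exception("flight plan %s rejected", plan_id)
            quarantined.append((plan_id, raw))
    return accepted, quarantined

def parse(raw):
    """Stand-in parser: fails on a made-up ambiguous waypoint name."""
    if "DUPE" in raw.split():
        raise ValueError("ambiguous waypoint")
    return raw.split()

accepted, quarantined = process_batch(
    [("A1", "ALPHA BRAVO"), ("B2", "ALPHA DUPE"), ("C3", "BRAVO ALPHA")],
    parse,
)
```

The bad plan ends up in `quarantined` with its id, and the two good plans on either side of it are still accepted.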



Edit:
Actually, thinking about it, as parts of NATS are based on 1960s technology, it's entirely possible the error logs are extremely limited in the actual data stored because 1960s computers obviously had very little storage space.
“Life is like riding a bicycle. To keep your balance you must keep moving.” -Albert Einstein

ian

Re: Air Traffic Control failure.
« Reply #48 on: 06 September, 2023, 09:07:37 pm »
I, for one, have been impressed by the speed with which blokes on the internet pivoted their comprehensive expertise in the last big thing to a deep understanding of ATC systems.

Sure thing. It's not like some of us have spent over 2 decades working with high availability systems...

But sure. Carry on

It might come as a disappointment, but not everything I write refers to you. Sorry.

quixoticgeek

  • Mostly Harmless
Re: Air Traffic Control failure.
« Reply #49 on: 06 September, 2023, 09:09:27 pm »
I, for one, have been impressed by the speed with which blokes on the internet pivoted their comprehensive expertise in the last big thing to a deep understanding of ATC systems.

Sure thing. It's not like some of us have spent over 2 decades working with high availability systems...

But sure. Carry on

It might come as a disappointment, but not everything I write refers to you. Sorry.

I wasn't talking about me.

J
--
Beer, bikes, and backpacking
http://b.42q.eu/