NOTAM a good day to fly…

There was a profound amount of head-scratching going on among outbound airline passengers Wednesday morning, and it was motivated by confusion.

The confusion wasn’t about the fact of cascading flight delays and cancellations – the increasing hemorrhage of flight listings turning to red on the reader boards bore out the deteriorating status of the system. No, what was befuddling to even the most experienced passengers was the explanation.

“Wait…what? A note system went bad and now we’re walking?”

By now, just about everyone tuned into the breaking news matrix has been force-fed what NOTAM stands for, and why we needed it most acutely Wednesday. But just to recap: NOTAM is an acronym that used to stand for “Notices to Airmen,” a method of disseminating lists of changes and inoperative runways and navigation equipment which started in the ’60s as teletype messages and grew to be a full-throated neural network that achieved the status of indispensability.

Around fifty years ago (yes, Virginia, a half-century), believe it or not, commercial aviation was still engaging in an increasingly esoteric debate over whether it was becoming too dependent on electronics and, in particular, computers. That seems so quaint now, since computer networks and servers and terminals are so inextricably intertwined with the airline business that almost any significant failure (read: network crash) will immediately ground an entire airline’s operations. In a nanosecond, a total loss of a carrier’s computer system leaves it unable to plan fuel loads and air routes, direct and load baggage, track or direct its flight crews, check in or board passengers (or even get them through security), answer a growing tsunami of calls to reservations, and, perhaps worst of all, legally dispatch a single flight.

The danger of computer catastrophes to one’s bottom line is so profound now that virtually all carriers go to great and expensive lengths to have independent, parallel systems that not only back up the main system, but can become the main system in seconds. With each airline’s networks interfacing with the FAA’s Central Command Center in Herndon, Virginia, when everyone and everything is running normally, the resulting ballet of 32-thousand domestic flights daily becomes a thing of wonder and beauty.

OK, but what of NOTAMs (now called Notices to Air Missions)?

Well, you wouldn’t want to try to fly to, say, JFK in a snowstorm if you didn’t know precisely what to expect in terms of runway or taxiway closures, or without knowing what instrument approach systems have been compromised. You wouldn’t want to fly to Minneapolis in a blizzard if you didn’t have completely accurate and up-to-date information on whether the snow plows were keeping ahead of the storm. That’s what the NOTAM system provides – information, and a lot of it, including information about weather warnings en route and the condition of the airport you’re leaving. Take away any given piece of the information puzzle and probably nothing bad will happen. But start crashing that neural network wholesale and depriving aircrews and dispatchers of this vital intelligence flow, and continued operation would clearly compromise what we swear to provide: the highest level of safety humanly (and cyberly, if there were such a word) possible.

Yes, the FAA got the NOTAM system back up early Wednesday before it issued a complete “ground stop.” But for about an hour, they were seeing incorrect information popping up. When you see that Runway 18 is open at Atlanta’s Hartsfield-Jackson Airport, for instance, it is not going to be a confidence builder to know that Atlanta doesn’t have a Runway 18! (The Atlanta example, by the way, is a construct.)

So, the FAA did what aviation should always do when faced with safety-related uncertainty: they erred on the side of caution and pulled the plug by issuing the first national ground stop since 9/11, a brave act which rattled the country.

Did you notice I haven’t discussed blame? That’s because to keep this debacle from happening again, blame must be discarded. Instead, we need to know everything that contributed to the situation so we can fix all of it and learn the lessons the first time. Yes, there was a human failure in mistakenly loading a piece of contaminated software, but that’s merely one of the links in the causal chain, and it does not eclipse the fact that the NOTAM system had grown like Topsy and become a patchwork of older computers increasingly vulnerable to a cascading failure.

We can applaud the fact that our system remained very safe and did the right thing by going to ground. What we can also applaud is that these days we understand that a complex adaptive human system cannot be fixed by blame and shame. And interestingly enough, we’ve also come far enough in understanding ourselves to know that the uselessness of blame and shame responses to accidents is a huge lesson equally applicable to every organization, whether a tiny team at your local McDonald’s, or a mega-business like Amazon.

So, again we can squeeze lemonade from lemons, even if your bags are still on the lam (I feel your pain).

I was happy to contribute to ABC News’ coverage of the unfolding situation on Wednesday. You can see me in this clip, if it’s of interest.