Monday, July 8, 2013

Cascading Fail: A Crash That Hits Close to Home

As many of you are aware, an airliner coming from Seoul, Korea crashed upon landing at San Francisco International Airport on Saturday, July 6, 2013. Much has been said in the news about the crash, and much will probably be said in the following days and weeks. My point for this post is not to talk about the crash, or the myriad systems that were or were not available. None of that changes the fact that an airliner crashed, scores of people were injured, two girls were killed, and a major international airport was brought to a grinding halt while the events that followed played out.

This crash had a direct impact on me, though, in two ways. First, it is the reason my daughter, who has spent the last eight days in Japan, is unable to come home yet. Because of the shutdown of the airport, many flights had to be diverted to smaller airports, which were quickly overwhelmed by the increased load. Many scheduled flights were cancelled, my daughter's included. Thankfully, due to some herculean and determined efforts by the team of chaperones, as well as the good will and kindness of the city of Narita, Japan, they were able to weather the hiccup. Note, this "hiccup" meant that their return was delayed by three days, including, at the current time, a rerouting through Seattle and an almost 24-hour layover.

The second way it impacted me is knowing that the two girls who were confirmed dead were both students coming over on an exchange trip from China. Both were in their mid-teens. In other words, their trip mirrored my own daughter's. I am a realist, and I understand Black Swan events, and the likelihood of a repeat performance is far less than one in a million, but that doesn't calm a father who now has new uncertainties and anxieties. Needless to say, it was a little too close to home.

Note, I'm not blaming anyone for this, but in the tester's world that I inhabit, there is rarely such a thing as an "isolated problem". Usually, when something goes wrong, it affects entire systems, and those systems in turn affect other systems. The net result of an error can cascade out and cause devastating problems, and the aftereffects may not be seen until well after the issue has occurred. Yes, I am one of those people who can see a testing story in everything, but today I am doubly reminded of how early mistakes that go uncaught can ripple out, and we really have no way of knowing just how far the problems can cascade. We may have only a momentary interest, as long as the issue doesn't affect us. Once it does, though, we start to see a much broader world of issues and problems. It is a good reminder to me in my workaday world to try to find problems early, while the options for course correction are wider and still possible.
