Space Shuttle Columbia Disaster: a lesson in information breakdown

Management decisions made during Columbia's final flight reflect missed opportunities, blocked or ineffective communications channels, flawed analysis, and ineffective leadership.

   - Columbia Accident Investigation Board Report, August 2003, Vol. 1, p. 170

The breakup of the space shuttle Columbia during re-entry was caused by damage to its thermal protection system from lift-off debris. The risk to the shuttle was known during the flight; unfortunately, poor information management prevented any concerted action. After the accident, the Columbia Accident Investigation Board found fault with the Shuttle Program's group decision-making practices.

Restricted leadership

Early missions adhered to safety standards that specified a zero-tolerance level for External Tank debris. When the Space Shuttle Atlantis sustained a large debris strike during liftoff in 1988, the damage was filmed in flight by a robot camera and the video was passed to NASA engineers for extensive analysis. After the successful re-entry, it was determined that a burn-through might have occurred had the extensive tile damage been elsewhere on the hull.

Complacency grew as debris strikes on five other missions failed to cause burn-throughs. As a result, fourteen years after the Atlantis damage, Columbia's Program Managers declined a crew inspection, decided against Earth-based and satellite-based imaging, and deliberately discounted the concerns of the Debris Assessment Team in favor of the status-quo "in-family" classification. The Board viewed this lack of institutional memory as a systematic failure by NASA to capture knowledge, made worse by the absence of any mechanism for a democratic expression of opinions.

Poor communication

The Board's report discusses the organizational causes of these problems. In particular, there were a large number of communications breakdowns:

Communication did not flow effectively up to or down from Program Managers. (p. 172)

Much of this was attributed to the complexity of a Space Shuttle flight; over 5,000 critical items were tracked for Columbia's mission. The staff, under time pressure and reduced by budget cuts, was unable to focus properly on all the issues or to direct attention to specific areas of concern. Adding to the difficulty was the large number of agencies and sub-contractors that had to be contacted for input on any particular issue.

Knowledge loss

Complementing the inadequate communication was a misalignment of relevant information with mission risk:

Yet integrated hazard reports and risk analyses are rarely communicated effectively, nor are the many databases used by the Shuttle Program engineers and managers capable of translating operational experiences into effective risk management practices. (p. 189)

As a result, the Shuttle Program was not able to generate performance trends from the data it had collected on previous missions. Had there been a confidence measure for the information surrounding the Columbia lift-off debris strike, the Managers might have become aware of the potential risk. The Board did conclude that a rescue, ironically by the Atlantis, might have been possible.

Stop&Think solution

The Stop&Think method would have raised the visibility of the debris strike problem through the preliminary step of problem definition. Next, by applying FastFusion, a task force composed of both Shuttle Program Managers and Debris Assessment Team managers would most likely have focused on the institutional knowledge of the Atlantis debris strike and its near burn-through. Finally, a low ST-Index (usability measure), signifying a hazard, would have been likely given the five previous debris strikes and the unmanageable number of critical items.



