Management decisions made during Columbia's final flight reflect missed opportunities, blocked or
ineffective communications channels, flawed analysis, and ineffective leadership.
- Columbia Accident Investigation Board Report, August 2003, Vol. 1, p. 170
The breakup of the space shuttle Columbia during re-entry was caused by damage to its thermal protection system from lift-off debris. The risk
to the shuttle was known during the flight; unfortunately, poor information management prevented any concerted action.
Post-accident, the Columbia Accident Investigation Board found fault with the Shuttle Program's group decision-making.
Early missions adhered to safety standards that specified a zero-tolerance level for External Tank debris. When in
1988 the Space Shuttle Atlantis sustained a large debris strike during liftoff, the damage was filmed
in flight by a robot camera and the video was passed to NASA engineers for extensive analysis. After the successful re-entry
it was determined that a burn-through might have been possible had the extensive tile damage been elsewhere on the hull.
Complacency increased when debris strikes in five other missions did not cause burn-throughs. The result
was that fourteen years after the Atlantis damage, the Columbia Program Managers declined a crew inspection,
decided against ground- and satellite-based imaging, and deliberately discounted the concerns of the Debris Assessment Team in
favor of the status-quo "in-family" classification. The Board viewed this lack of institutional memory as a systematic
failure by NASA to capture knowledge, with no mechanism in place for a democratic expression of opinions.
The Board's report discusses the organizational causes of these problems. In particular, there were a large number
of communications breakdowns:
Communication did not flow effectively up to or down from Program Managers. (p. 172)
Much of this was attributed to the complexity of a Space Shuttle flight; over 5,000 critical items were tracked
for Columbia's mission. The staff, under time pressure and downsized by budget cuts, was unable to focus properly
on all the issues or to direct attention to specific areas of concern. Adding to the difficulty was the large
number of agencies and subcontractors that had to be contacted for input on any particular issue.
Complementing the inadequate communication was a misalignment of relevant information with mission risk:
Yet integrated hazard reports and risk analyses are rarely communicated effectively, nor are the many databases
used by the Shuttle Program engineers and managers capable of translating operational experiences into effective
risk management practices. (p. 189)
As a result, the Shuttle Program was not able to generate performance trends from the data it had collected from
previous missions. Had there been a confidence measure for the information surrounding the Columbia lift-off debris
strike, the Managers might have become aware of the potential risk. The Board did conclude that a rescue,
ironically by the Atlantis, might have been possible.
The Stop&Think method would have raised the visibility of the debris strike problem through the preliminary
step of problem definition. Next, by applying FastFusion, a task force composed of both Shuttle Program Managers and Debris Assessment
Team managers would most likely have focused on the institutional knowledge of the Atlantis tile damage and its potential for burn-through. Finally, a low ST-Index
(usability measure) - signifying a hazard - would have been likely, given the five previous debris strikes and the unmanageable number
of critical items.