Root Cause Failure Analysis is the term used at NASA – National Aeronautics and Space Administration. Due to budget cuts, NASA was recently closed, but there are some good lessons learned from when we did have a space program. One of those is their approach to Root Cause Analysis. Their approach is interesting, but it also reveals a big weakness that we can all learn from.
I’ve written before about Root Cause Analysis:
- Jeff Bezos, Root Cause Analysis
- 5 Why in Lean Startup
- Taiichi Ohno and Root Cause Analysis
- Henry David Thoreau: Hack at Branch, or Strike at Root?
The term used at NASA was Root Cause Failure Analysis. That subtle change is important because their focus was on any failure – mechanical, system, or otherwise – that prevented the successful launch and landing of a spacecraft mission. In their words,
The objective of root cause failure analysis is to identify root cause(s) so that these latent failures may be eliminated or modified and future occurrences of similar problems or mishaps may be prevented. One failure of analysis pitfall: If root cause failure analysis is not performed, and the analyst only identifies and fixes the proximate causes, then the underlying causes may continue to produce similar problems or mishaps in the same or related areas.
They provide an example,
For example: A fuse blows out and causes the lights to go off. You can identify the proximate cause, fuse blew, and replace the fuse. You can also identify the intermediate cause, a short, and repair the wire that shorted. However, if you do not identify and correct the organizational factor that led to the fuse going out (e.g., wiring not maintained because there was insufficient maintenance budget), other systems may have similar failures due to lack of maintenance. Root cause analysis seeks to identify the systemic problems, such as lack of maintenance budget, and correct these so that related problems or mishaps do not occur.
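The fuse example can be sketched as a small causal chain in code. This is only an illustration, not NASA's tooling: the `Cause` class, the `chain_to_root` helper, and the wording of the corrective action for the root cause are my own assumptions. The three causes and the first two fixes come straight from the quote.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cause:
    """One link in a causal chain (hypothetical structure, for illustration)."""
    description: str
    level: str                      # "proximate", "intermediate", or "root"
    corrective_action: str
    caused_by: Optional["Cause"] = None


def chain_to_root(cause: Cause) -> List[Cause]:
    """Follow caused_by links from the proximate cause to the deepest known cause."""
    chain = []
    current = cause
    while current is not None:
        chain.append(current)
        current = current.caused_by
    return chain


# The fuse example from the quote. The root-level corrective action is an
# assumed wording; the quote only says the systemic problem should be corrected.
root = Cause("Insufficient maintenance budget", "root",
             "Fund routine wiring maintenance")
intermediate = Cause("Short in the wiring", "intermediate",
                     "Repair the shorted wire", caused_by=root)
proximate = Cause("Fuse blew", "proximate",
                  "Replace the fuse", caused_by=intermediate)

chain = chain_to_root(proximate)
for c in chain:
    print(f"{c.level:<12} {c.description} -> {c.corrective_action}")
```

Stopping the walk after the first or second link corresponds to the pitfall in the quote: the fuse gets replaced or the wire repaired, while the deeper cause stays in place.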
NASA also uses an approach they call a Causal Factor Tree, which is a visual picture of the root cause analysis and their potential countermeasures or, in their words, corrective actions:
One glaring outcome from the NASA approach to Root Cause Analysis is a bias toward pointing to organizational issues. For example, take a look at the example below:
Notice that in this example, the root causes that this team arrived at were issues around Budget. To some degree, this is emblematic or stereotypical of government-run programs, where the focus at times is on money, or lack thereof.
But we know that, in the true spirit of Kaizen, it is about Mind Before Money, Creativity Before Capital, and Wits Before Wallets.
NASA, we’ll miss you. It’s too bad we couldn’t sustain a long-term space program. Maybe someday. In some distant future.
Alan Longland says
This is a lovely example of one of the benefits we all gained from the work NASA did, but which was always difficult to quantify in budgetary terms.
Of course the “Kaizen” position might militate against the levels of staff NASA might require, but it does give us a framework to cost the failure in real terms and argue for a modified approach.
Andy Wagner says
NASA asked me to pass on the word “rumors of my death have been greatly exaggerated.”
The space shuttle program shut down, not due to budget cuts, but due to the fact that the vehicle was ~30 years old and no longer viable. The replacement vehicle was canceled due to poor program management, requirements definition, and cost control, but I think the fellows sitting on the space station right now would be a bit disturbed to find out that they were “closed.” The thousands of other NASA employees and contractors conducting space and aeronautical research and exploration, including unmanned exploration of the solar system, would also be surprised by that news.
Pete Abilla says
Hi Andy – I suppose I stand corrected. Thanks for the clarification and I apologize to all the NASA employees, especially those currently in outer space.
I hate to be that guy, but NASA did not always fully embrace the process. As I recall from Feynman’s account, they fought him openly as well as passive-aggressively when he was brought in to perform the RCFA on the Challenger explosion.
Alan Charles says
I disagree with your findings and conclusion. What usually happens in root cause analysis is that the result depends on the level of management looking at the particular problem. A shop floor analysis may stop at a root cause that is different from the one a senior manager would find. Therefore, one can always ask another why, and in this situation you tend to end up with the system issue that needs addressing. These are my findings from many years of doing this at Toyota. For the NASA case, reviewing what they have done, I agree with you that it’s not right and that they basically need some training in this area themselves. But to conclude that big organisations find this type of root cause is not a totally correct conclusion to make for your article. If they, or anyone, use “therefore” to go back up the tree, it doesn’t fit in some cases. Asking “why” going down and “therefore” coming back up is essential to confirm the findings. NASA’s tree approach is OK but needs to be used correctly. Please offer your services to train them. You can end up with a series of root causes that need addressing; the tree needs more branches.
What is the difference between FMEA and RCFA? In either case we do RCA to derive the root cause to prevent failure.