In 1984, Rock Master Scott & the Dynamic Three released their hit song “The Roof is on Fire” that I will freely admit to first hearing in the hit movie from the same year—Revenge of the Nerds.
The song, which has been sampled or referenced by dozens of other recording artists, came to my mind recently when pondering why, when it comes to determining the root cause of data governance and data quality issues, many people all throughout the organization seem to simply throw their hands in the air—and wave them around like they just don’t care.
(Yes, I know you are waving your hands in the air right now.)
It’s as if when it comes to root cause analysis, they’re all singing the same chorus:
The Root! The Root! The Root Cause is on Fire!
We don’t want to determine why, just let the Root Cause burn.
Burn, Root Cause, Burn!
(Yes, I know you are singing the original chorus right now.)
However, determining the root cause is always an absolute necessity.
Somebody say: “Oh, yeah!” (Oh, yeah! . . . Thanks.)
The Five Whys
I have never been able to shake the habit, which most of us developed as children, of asking “why?” after every answer given by an adult, especially a parent, who ultimately had no choice but to default to “because I said so!”
Eventually you stopped asking why, assuming you would get the default answer.
However, that little “life lesson” does not serve us well in the business world.
One popular approach to root cause analysis is known as The Five Whys, where you continue to ask why (and more than five times, if necessary) in order to identify the fundamental (and often complex) cause and effect relationships underlying an issue.
When a data governance or data quality issue is first encountered, most often you are only seeing one of the possibly numerous effects of its root cause (or causes).
In her blog post A Data Governance Primer (Part 1 of 3), Carol Newcomb included an excellent checklist for performing a thorough root cause analysis.
Some form of data remediation is most commonly performed whenever addressing a data governance or data quality issue.
However, root cause analysis may reveal that additional remediation is also required, sometimes for business processes, technology, or people—and sometimes, all three.
A business triage for critical problems may legitimately prioritize a reactive short-term response, which doesn’t proactively address the root cause—and therefore does not prevent or minimize its recurrence in the long-term.
However, even if you can’t immediately address the root cause, you need to always identify it—or at the very least publicly acknowledge that the root cause is currently unknown, but additional investigation is both necessary and already planned.
Let Me Hear You Sing Along
Now everybody clap your hands and stomp your feet,
While this DG/DQ-DJ scratches a funky beat.
He’s making you realize you must determine,
Why exactly the root cause keeps on burning.
Now twist and turn while your analysis churns,
And show everybody else what you just learned.
Let’s all get together and form a crowd,
While this DG/DQ-DJ plays it nice and loud.
Now everybody here in the place to be,
Let’s all get together and repeat after me.
Everybody say: “Root Cause Analysis” (Root Cause Analysis)
Everybody say: “You have to perform it!” (You have to perform it!)
Now everybody say: “Oh, yeah!” (Oh, yeah!)



#1 by Stray__Cat at June 23rd, 2010
Nice and entertaining post.
Seriously.
I’ve always been a bit disoriented by the use of root cause analysis. When it’s mentioned I can’t help feeling a bit unbalanced. If we are not talking about the root causes WHAT ARE WE TALKING ABOUT? Are we brushing the dust under the carpet?
#2 by Jim Harris at June 23rd, 2010
Thanks Augusto (aka Stray__Cat) for your comment, your feedback is greatly appreciated.
The easiest data quality example is when, driven by a business triage for critical data problems, reactive data cleansing is purposefully chosen over proactive defect prevention.
The priority is finding and fixing the near-term problems rather than worrying about the long-term consequences of not identifying the root cause and implementing process improvements that would prevent it from happening again.
Although this is sometimes the necessary immediate response, we still need to go back and, at the very least, identify what the root cause was–and it is this activity, in my experience at least, that gets swept under the carpet.
#3 by Cory Boisoneau at June 24th, 2010
Interesting post Jim! I think you have some good ideas here, but fundamentally, I have a few issues with the 5-Why’s approach.
First, 5-Why’s doesn’t really show cause and effect relationships – it shows one cause after another in a linear pattern, and very few problems I’ve ever seen occur in a linear fashion. There are always multiple causes at one point in time leading to any effect.
In addition, you talk about finding “the root cause”. I also have a fundamental problem with this, since there are always multiple root causes. Full disclosure, I work for Apollo RCA. We teach that a root cause is any cause we act upon with an effective solution. The bottom line is that we are not looking for “root cause”, we are looking for effective solutions. Thanks!
#4 by Jim Harris at June 24th, 2010
Thanks for your comment, Cory.
Yes, the fundamental cause and effect relationships underlying an issue are almost always too complex for a simple linear series of questions to isolate a single root cause, since as you said, there are often multiple causes at any one point in time, which can lead to any effect.
Approaches used in chaos theory and quantum physics would probably make for better metaphors, which is really what the 5 Whys are in my opinion–a metaphor for analyzing a problem, which will lead to implementing a more proactive solution (when one exists) than simply reacting to every problem as it happens.
In quantum mechanics, the uncertainty principle teaches us that certain pairs of physical properties, like position and momentum, cannot simultaneously be known to arbitrary precision. That is, the more precisely one property is measured, the less precisely the other can be measured.
As a metaphor, we could use the uncertainty principle to caution against mistaking any single cause as the root cause of a complex problem.
However, one of the dangers with this more detailed–though completely valid–argument/discussion we are having is that it can actually encourage organizations to believe proactive solutions don’t really exist and provide them with another excuse to remain entirely in a reactive mode.
Especially in relation to data quality, although some defects are truly not preventable because they are not predictable, some defects are preventable, and focusing on the preventable defects is an absolute necessity.
Now, of course, you could argue that preventable defects are the only ones with a root cause because they are less complex. Yes, this may be true, which is precisely why organizations should fix them–but sadly, too often they do not even prevent the defects that do possess a simple root cause that can be easily determined–and sometimes with even fewer than Five Whys.
Best Regards,
Jim
#5 by Mark at June 24th, 2010
Thought I’d provide a little more information about why NOT to recommend 5-Whys.
See these blog posts …
http://www.taproot.com/wordpress/2007/03/14/teruyuki-minoura-toyota-exec-talks-about-problems-with-5-whys/
http://www.taproot.com/wordpress/2009/11/25/root-cause-analysis-tip-dont-follow-bad-root-cause-analysis-advice/
http://www.taproot.com/wordpress/2009/08/26/more-bad-root-cause-analysis-advice/
Hope that gets you thinking about other ways that are more effective for finding real root causes.
By the way – I liked your singing!
Mark
#6 by Jim Harris at June 25th, 2010
Thanks for your comment, Mark.
First of all, and especially after my previous rambling comment to Cory, I need to point out two things:
(1) I love it when people disagree with me–and not because I think it gives me an opportunity to prove them wrong and me right. Disagreement and contest–and not consensus or compromise–always lead to the best collection of information by providing multiple viewpoints.
(2) My blog post was NOT intended as an homage to The Five Whys–I DON’T think it is a great technique. My blog post IS about the importance of performing root cause analysis. I simply chose The Five Whys for its simplicity in providing ONE example of performing root cause analysis.
Therefore, thanks for providing the great links that help explain the obvious limitations of using such a simplistic technique for analyzing the root cause of a problem.
However, let me close with an example of a common–and yes, very simple–data quality problem where even a One Why technique could help determine the root cause:
A free-form text field is being used to collect the Country that a customer lives in. Since it is free-form text, you can get values such as:
Brazil
United States of America
Portugal
United States
República Federativa do Brasil
USA
Canada
Federative Republic of Brazil
Mexico
República Portuguesa
U.S.A.
Portuguese Republic
As well as values such as:
Gondor
Gnarnia
Rohan
Citizen of the World
The Land of Oz
The Island of Sodor
Berzerkistan
Lilliput
Brobdingnag
Teletubbyland
Poketopia
Florin
The first list contains countries, which are real, but a lack of standard values introduces needless variations. The second list contains fictional countries–that people like me enter into free-form fields to prove a point or simply to amuse themselves (well okay, both).
The root cause of this problem is simple–a free-form text field lets people enter whatever they want.
The solution is equally simple–provide a drop-down box of standard values provided by an external reference authority such as a list of ISO 3166 standard country codes.
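A drop-down prevents new variations, but the values already collected through the free-form field still need cleansing. As a rough sketch only, the Python below maps the real-country variations listed above to ISO 3166-1 alpha-2 codes and returns None for anything it does not recognize, so that those values can be reviewed by a person. The names ALIASES, normalize_key, and to_country_code are hypothetical, and the alias table is deliberately tiny; a real one would be built from the full ISO 3166 list.

```python
import unicodedata

# Hypothetical alias table covering only the real countries from the example.
# Keys are already normalized (lowercase, no accents, no periods).
ALIASES = {
    "brazil": "BR",
    "federative republic of brazil": "BR",
    "republica federativa do brasil": "BR",
    "united states": "US",
    "united states of america": "US",
    "usa": "US",
    "portugal": "PT",
    "portuguese republic": "PT",
    "republica portuguesa": "PT",
    "canada": "CA",
    "mexico": "MX",
}


def normalize_key(raw):
    """Strip accents and periods, lowercase, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", raw)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = no_accents.replace(".", "").lower()
    return " ".join(cleaned.split())


def to_country_code(raw):
    """Return the ISO 3166-1 alpha-2 code, or None if the value is unknown."""
    return ALIASES.get(normalize_key(raw))


if __name__ == "__main__":
    for value in ["U.S.A.", "República Federativa do Brasil", "Gondor"]:
        print(value, "->", to_country_code(value))
```

Returning None rather than guessing keeps entries such as Gondor or Teletubbyland visible as defects instead of silently forcing them into a real country.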
Again, my point from my other rambling comment is that far too often in the data quality space, not even straightforward problems with simple root causes are being addressed.
In conclusion: Perform Root Cause Analysis–I don’t care HOW you do it, I just want you to DO it.
Thanks and Best Regards,
Jim
P.S. Thanks for liking my lyrics–trust me, you would NOT like my actual singing.