Difference between revisions of "Root Cause Analysis"

From CitconWiki

Revision as of 02:30, 12 November 2011

09:00 on Saturday morning, Nov 12, 2011, in Space Invaders (the big room)

Squirrel has slides on how to go about doing a root cause analysis (PJ reminder: get the slides from Squirrel to attach to the wiki)


Target a specific event

Could do a root cause analysis on a "big" event over time, for example as part of a master's thesis
Can be helpful to start with the "level" of pain
Defects are not "really" defects; they are misunderstandings
Should do production bugs

Everyone affected attends

The "feature" team attends. What about senior managers?
Representatives from other areas of the business
Not always good at getting "everyone" in the room
One technique is to give people the results from a session they did not show up for

No blame

Ops folks tend toward blame
Need to set it up ahead of time to avoid blame... "inoculate" people against blame
Anti-pattern: "as long as it isn't MY discipline, then I have gotten what I want out of this session"

Poll to identify problems

Go around the entire room and ask, "Hey PJ, please list all the problems"
Then go around the room and ask for add-ons
Private ballots on post-its. Email solicitation.
Try to avoid proxies. Get the right people in the room

Write a lot

Move down then across

If it doesn't hurt, then you aren't doing it right

Proportionate tasks

If you are rewriting your entire app because of 3 minutes of downtime, then you are not doing the right thing

All tasks done in a week

Every task agreed to:

1) Has to be do-able in one week
2) Has to actually be done in one week

How does this compare to retrospectives?
Retros are related to teams; the pain is more direct

Other techniques for NOT losing focus?
Keep it short term
The next root cause analysis might highlight the "next" step, but for now, "all we have to do now is take this first step"