Troubleshooting is often a solitary exercise: the cost of labor favors the lone wolf problem-solver. The cable guy is usually—just one guy. I’ve never called a plumbing service and had a whole team of plumbers show up. Especially when repair work is an unreimbursed cost for a company (e.g., on-site warranty service), it’ll likely be just one person on the call.
I’ve seen “collaborative environments,” where troubleshooters work in close proximity and can rely on each other for help. Think about an auto repair shop where the mechanics have their own repair bays, but colleagues are just a shout away to provide assistance. Still, this is only a small tweak on the solitary problem-solving routine: you’re expected to work by yourself and others are consulted only when you’re stuck.
Of course, there’s another way to troubleshoot: with a team. I’ve worked on large-scale problems that were too complicated (and too important) to leave to a single person. When the company’s future is on the line, expect to be given all the resources you need to resolve an issue. That will inevitably include some extra personnel.
However, you might not be used to problem-solving in a group setting. If you are lucky enough to command an entire troubleshooting team, you’ll want to make the most of this precious human resource. Always start with a brainstorming and organizing session where you discuss strategies and assign roles. Since you’ll have everyone’s attention, this is the time to really open things up and consider all of your options. During this meeting, you’ll also want to address these issues:
Especially if you’re troubleshooting in the midst of a crisis, there will be lots of people interested in the outcome: customers, management, other teams, etc. You don’t want these requests for updates to be a constant source of interruptions for the people actually solving the problem. Therefore, be proactive and assign someone to take on the role of the communicator. They will relay the team’s status to interested parties and free the rest of your team to actually get some work done. Another tip: have this person set clear expectations as to the frequency of communication. You will receive fewer interruptions from interested parties if they know they will be receiving updates on a regular basis.
Alternatives And A “Plan B”
Of course, you should put most of your personnel on the “high-percentage play” (i.e., the strategy you feel is most likely to pay off). But, if your team is large enough, consider breaking into multiple teams, each of which will pursue alternate theories regarding the cause. One such team can even be assigned to “Plan B,” preparing for the eventuality that you will not find the cause or fix the problem. This team could be building out a backup system or investigating workarounds.
Once you’ve decided on a strategy, put limits on how long you’ll work without finding a solution (by the way, this is a good idea for the solo troubleshooter as well). The threshold can be the achievement of a specific goal, but always include a time limit as well. For example:
“We’ll spend the next 2 hours trying to determine if replacing the valve will bring the pressure back to normal levels. If the pressure rises above 100 lbs./sq. in., then we’ll stop and shut off the boiler to avoid an explosion.”
The thresholds set in this example are:
- Timed goal: work through a specific fix for the next 2 hours, then break and reassess.
- Pressure: stop and mitigate if the pressure rises above 100 lbs./sq. in.
Time limits are a crucial backstop to arrest the momentum of a plan that is going nowhere. Tunnel vision can be difficult to overcome unless there’s a prearranged means to stop the madness.
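If it helps to see the two thresholds from the boiler example written down as explicit rules, here is a minimal Python sketch. The names `Thresholds` and `next_action` are made up for illustration, and the ordering (safety check before the timer) is my assumption about how you would want to prioritize them, not something the example spells out.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    """Limits agreed upon before the work starts (hypothetical structure)."""
    time_limit_hours: float   # stop and reassess once this much time has passed
    max_pressure_psi: float   # stop and mitigate above this reading


def next_action(elapsed_hours: float, pressure_psi: float, limits: Thresholds) -> str:
    """Return what the team should do given the current time and reading."""
    # Assumption: the safety threshold overrides the timer, so check it first.
    if pressure_psi > limits.max_pressure_psi:
        return "stop and mitigate"
    if elapsed_hours >= limits.time_limit_hours:
        return "break and reassess"
    return "keep working"


# Values taken from the boiler example: 2 hours and 100 lbs./sq. in.
limits = Thresholds(time_limit_hours=2.0, max_pressure_psi=100.0)

print(next_action(0.5, 80.0, limits))   # well inside both limits
print(next_action(2.0, 80.0, limits))   # timer has run out
print(next_action(1.0, 105.0, limits))  # pressure limit exceeded
```

The point of writing the rules down ahead of time is that nobody has to argue about them in the heat of the moment: whoever is watching the timer and the gauge only has to compare numbers.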
Keep Someone “Up Above It”
One person needs to be managing and monitoring everything listed above and making corrections as needed. Ideally, they’re not involved with the actual troubleshooting and therefore won’t get swept up in the details. If you’ve set thresholds, this person is watching the timer and the other parameters you’ve agreed upon. If you’re pursuing multiple alternatives, they’re checking in with the various teams on a regular basis, synthesizing this information and deciding to either maintain course or change direction.
Teamwork, An Everyday Thing
If you have the latitude to change how you problem-solve, consider introducing more teamwork. Specifically, try it in pairs. I’ve done “pair troubleshooting” on numerous occasions and have found working with a competent partner has many benefits:
- Improved quality of work: problem-solving with someone else “keeps you honest.” Knowing your work is being scrutinized makes you less likely to take the “easy road”—this means fewer shoddy repairs. Fixes will typically conform to the person with higher standards.
- Faster resolution: more collective experience means it’s likely that someone has “seen this one before.” Four eyes will view the problem more clearly and four hands will speed the work along.
- Less chance of pursuing dead-ends: if you’re on the “road to nowhere,” someone will get bored or frustrated and demand a reassessment of the situation.
- Better ideas: you get two different points of view, and either of you can ask “stupid” questions to get the team unstuck.
- Keeps you externally focused: the interactive and social aspects of working with a partner are another hook to remaining present while problem-solving.
Of course, putting two workers on a troubleshooting project has an opportunity cost: they could be pursuing their own repair projects individually. If two employees are able to solve a problem in the same amount of time that a single worker could, all pair troubleshooting would do is double your labor costs! However, my impetus to experiment with pair troubleshooting originated from my very positive experiences with “pair programming” in the world of software development. Researchers Alistair Cockburn and Laurie Williams studied pairing in that context and found:
The significant benefits of pair programming are that
- many mistakes get caught as they are being typed in rather than in QA test or in the field (continuous code reviews);
- the end defect content is statistically lower (continuous code reviews);
- the designs are better and code length shorter (ongoing brainstorming and pair relaying);
- the team solves problems faster (pair relaying);
- the people learn significantly more, about the system and about software development (line-of-sight learning);
- the project ends up with multiple people understanding each piece of the system;
- the people learn to work together and talk more often together, giving better information flow and team dynamics;
- people enjoy their work more.
The development cost for these benefits is not the 100% that might be expected, but is approximately 15%. This is repaid in shorter and less expensive testing, quality assurance, and field support.
“The Costs and Benefits of Pair Programming” by Alistair Cockburn and Laurie Williams.
To the best of my knowledge, no one has studied “pair troubleshooting,” but personal experience tells me the benefits are similar. Apart from software development, I also paired up systems administrators on my teams and had them troubleshoot together. I would overhear them catching each other’s errors—mistakes that would have been costly to fix later on! I was also impressed with how knowledge got spread around by team members collaborating and talking to each other (the study mentions this benefit as “the project ends up with multiple people understanding each piece of the system”).
I think the strongest argument for the adoption of pair troubleshooting is the quality of work produced. The Cockburn/Williams study showed that pairs produced software with 15% fewer defects (up to a 50% reduction in other studies) at an increased cost of only 15%. Depending on the circumstance, higher quality work can more than justify the additional cost of pairing. If you’re in an industry where defects in repair work are extremely costly (or deadly), then pairing warrants your consideration. Software bugs can have a huge cost: they suck up the time of customer service reps and field technicians, and they must eventually be found and fixed (for more on the topic, see the “Economics” section in the Cockburn/Williams study; it’s quite compelling). Likewise, “bugs” in your repair work can also exact a hefty price: you know this all too well if you’ve ever been called back to fix something you thought was already fixed. Customers want the confidence of knowing it was fixed right the first time!
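To make the trade-off concrete, here is a rough back-of-the-envelope sketch in Python. The two 15% figures come from the study as cited above; everything else (the function name `pairing_net_savings`, the labor cost, the number of expected defects, and the cost per defect) is a made-up illustration, and assuming the software numbers carry over to repair work is exactly that, an assumption.

```python
def pairing_net_savings(solo_labor_cost: float,
                        expected_defects: float,
                        cost_per_defect: float,
                        extra_cost_rate: float = 0.15,
                        defect_reduction: float = 0.15) -> float:
    """Estimate the savings from pairing on one job (negative means a net loss).

    extra_cost_rate and defect_reduction default to the 15% figures cited
    from the Cockburn/Williams study; all other inputs are hypothetical.
    """
    extra_labor = solo_labor_cost * extra_cost_rate
    avoided_defect_cost = expected_defects * defect_reduction * cost_per_defect
    return avoided_defect_cost - extra_labor


# Hypothetical job: $1,000 of solo labor and 2 expected defects.
cheap_defects = pairing_net_savings(1000, 2, cost_per_defect=200)
costly_defects = pairing_net_savings(1000, 2, cost_per_defect=2000)

print(f"Cheap defects:  {cheap_defects:+.2f}")   # pairing loses money here
print(f"Costly defects: {costly_defects:+.2f}")  # pairing pays for itself here
```

With these invented numbers, pairing only comes out ahead when each defect is expensive to fix, which is the same conclusion as above: the more a callback costs you, the easier the extra labor is to justify.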
Pair troubleshooting won’t be a good fit in all circumstances: your industry’s economics and your organization’s culture will circumscribe the possibilities for introducing teamwork. Personnel is a factor too: I’ve worked with some “lone wolf” types who were quite resistant to the concept. Can you convince them to hunt with the pack?
- Alistair Cockburn and Laurie Williams, “The Costs and Benefits of Pair Programming.”