Every troubleshooting project requires an entry point. How do you find a good place to start? Choosing poorly can mean the difference between fruitful problem-solving and a trip down the proverbial rabbit hole. It seems like it should be straightforward, but it’s often not. What’s usually obvious is the symptom, but of course that’s different from the cause. If you’re new to a machine, knowing where to begin will be an intuitive process involving trial and error. Before you’ve achieved expertise with a system, consider these promising possibilities for starting points:
Start By Duplicating The Problem
Always a great place to start, because replicating the problem allows you to start a loop where you:
- Try something
- Test to see if the failure is still present
Unfortunately, there are times when attempting to duplicate a problem isn’t a very useful beginning, like when:
- The machine has not recently been operational and there are likely to be multiple failures. The duplicability of these situations will be 100%, but that fact won’t necessarily aid problem discovery.
- The problem is intermittent.
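The try-and-test loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `troubleshoot`, `candidate_fixes`, `failure_present`, and the toy `machine` dictionary are hypothetical names I made up for the example, and a real "fix" would be a physical or software action rather than a dictionary update.

```python
from typing import Callable, Iterable, Optional, Tuple


def troubleshoot(
    candidate_fixes: Iterable[Tuple[str, Callable[[], None]]],
    failure_present: Callable[[], bool],
) -> Optional[str]:
    """Apply candidate fixes one at a time, re-testing after each one.

    Returns the name of the first fix after which the failure no longer
    reproduces, or None if the failure persists or never reproduced.
    """
    if not failure_present():
        # The problem cannot be duplicated, so there is nothing to test against.
        return None
    for name, apply_fix in candidate_fixes:
        apply_fix()                # try something
        if not failure_present():  # test to see if the failure is still present
            return name
    return None


# Toy stand-in for a machine (hypothetical): it works only when both are True.
machine = {"fuse_ok": False, "plugged_in": True}
fixes = [
    ("reseat plug", lambda: machine.update(plugged_in=True)),
    ("replace fuse", lambda: machine.update(fuse_ok=True)),
]
result = troubleshoot(
    fixes, lambda: not (machine["fuse_ok"] and machine["plugged_in"])
)
print(result)  # replace fuse
```

Note that the sketch returns `None` right away when the failure can't be reproduced, which is exactly the intermittent case where this starting point stops being useful.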
Start With An Inspection
Giving a failed system a once-over is a great way to initiate the problem discovery phase. A basic inspection may uncover obvious signs like smoke, weird smells, or noises. These attention-grabbing symptoms may highlight the problem area, but remember not to confuse them with the actual cause of the problem. Even so, it’s a gift when the starting point is so obviously presented to you.
A basic inspection may also turn up less glaring, yet still promising leads. I’m talking about cracks, dirty contacts, frayed wires, or bent parts. You might notice a disconnected hose, a wire about to come off a post, or a loose screw. Machines are supposed to be orderly inside, so even if you aren’t an expert in how a system is supposed to work, you can still spot things that are amiss. You will hear yourself saying: “This doesn’t look right.” When searching for things out of place, include fluids not where they should be (aka, leaks).
Digital devices have an extra layer, beyond the physical, that requires inspecting. Of course, they also have problems that manifest themselves physically: I’ve smelled all kinds of wonderful things emanating from computers over the years (like burning power supplies). However, there is a whole host of problems that can lie hidden among the bits, unobservable to the naked eye. For this dimension, viewing the console, running diagnostic programs, and examining logs are the digital equivalent of the walk-around.
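As a rough illustration of a digital walk-around, here is a minimal Python sketch that skims log lines for attention-grabbing words. Everything in it is an assumption made for the example: the keyword list, the `scan_log` name, and the sample log lines are hypothetical, and real logs will need patterns suited to their own format.

```python
import re
from collections import Counter

# Assumed set of "this doesn't look right" keywords; adjust for your system.
SUSPICIOUS = re.compile(
    r"\b(error|fail(?:ed|ure)?|warn(?:ing)?|timeout|panic)\b", re.IGNORECASE
)


def scan_log(lines):
    """Return (hits, counts): flagged (line number, text) pairs and keyword tallies."""
    hits = []
    counts = Counter()
    for lineno, line in enumerate(lines, start=1):
        match = SUSPICIOUS.search(line)
        if match:
            hits.append((lineno, line.rstrip()))
            counts[match.group(1).lower()] += 1
    return hits, counts


# Hypothetical log excerpt, invented for illustration.
sample = [
    "12:00:01 INFO  service started",
    "12:00:05 WARNING disk usage at 91%",
    "12:00:09 ERROR failed to write /var/tmp/cache",
    "12:00:10 INFO  retrying",
]
hits, counts = scan_log(sample)
print([lineno for lineno, _ in hits])  # [2, 3]
```

As with smoke and strange noises, the flagged lines are symptoms that point to an area worth a closer look, not necessarily the cause.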
Start With Routine Maintenance
Any time you discover a lapse in routine maintenance, you’ve also found a great starting point. You know that maintenance regimes are designed to prevent the most common problems, so it makes sense that simply doing the recommended upkeep can bring a system back to life.
Part of routine maintenance is keeping current with the latest-and-greatest from the manufacturer and end user community. This means awareness of recalls, applying firmware or software upgrades, and adhering to any new “best practices” on how to effectively use a machine. Especially when a problem is known and a fix readily available (as with a recall or a software update), this aspect of maintenance is a great place to start a new troubleshooting project.
Start With The Manual Or Technical Support Documents
Many manufacturers kindly list the most common problems and solutions for a particular product in their manuals or support guides. Yes, you should take the time to familiarize yourself with this information. Why reinvent the wheel?
You might usually throw this kind of material away. That’s okay, because most manufacturers publish their product manuals online for easy access. If the machine in question is expensive and/or mass-produced, there may be significant third-party resources available to help you troubleshoot, perhaps with step-by-step instructions for solving even the smallest of problems. Automobiles, for example, have a large professional troubleshooting industry that spans the spectrum from do-it-yourself (manuals, parts stores, etc.) to full-service “don’t make me lift a wrench” solutions (repair shops).
Manufacturer materials are usually a big win because of the deep experience they have with their products. From design, to testing, to manufacturing, to reports from customer service, they are in a unique position to give good advice. For you, the amount of time and expertise needed to independently discover the knowledge encapsulated in even one line of a troubleshooting guide might be significant. The general principles in The Art Of Troubleshooting will help you to address any problem, but there really is no substitute for having the solution to your exact problem laid out in black and white. Don’t worry, there will be plenty of “undocumented” problems for you to solve. For the rest, feel free to take the easy route! Again, if the solution was there all along, you’re going to kick yourself later if you wasted your valuable time trying to be a hero.
Even if there isn’t a troubleshooting guide for your particular problem, contacting the manufacturer might still be fruitful: they may not be able to offer a solution, but someone in the service department should be able to steer you in the right direction.
Start By Realizing That You Are Not A Beautiful Or Unique Snowflake
At least not when it comes to troubleshooting. When fixing mass-produced items like consumer electronics or cars, take comfort in the fact that you’re not alone. There’s a good chance that someone has encountered your problem before and there is a fix or a workaround just waiting to be uncovered, should you bother to lift your fingers and type the magic words into a search engine (or your company’s issue tracking database).
I’ve been totally astonished by breakdowns I would have thought to be completely unique. We’re talking about some really obscure and exotic failures. However, after searching online forums, I often discover:
- Hundreds or even thousands of others have encountered the same problem.
- A fix or workaround that is fully tested, documented, and ready to implement (sometimes years old!).
Welcome to living on a planet with 7+ billion people. Sorry, but you’ll need to find some other way to make yourself feel unique besides having the most interesting malfunction (may I suggest a vintage hat worn at a jaunty angle or perhaps a t-shirt with a clever slogan?). The upside is that you can leverage the collective experience of the masses. Save your sweat for the situations that actually require “going it alone.” Being a smart troubleshooter means tapping into the social network. It doesn’t matter to me if you solve the problem by Googling or through a Herculean triumph of reasoning.
Either way, you can take the credit. After finding a solution off the Internet (which sounds boring), I occasionally add a little drama by telling people that the answer came to me in a dream: a soaring bald eagle represented the engine and a man wearing a loincloth made of bacon represented the bad spark plug…
Start Where The Graphs Get Funky
If you have solid operational data and can answer the question “What is normal?” then a good place to start troubleshooting is with your graphs. Look for:
- Parameters outside their normal range
- Increasing/decreasing periodicity (i.e., cycles taking longer or shorter than normal)
- Upward or declining trends
Cross-comparing and overlaying data can be a powerful way to start theorizing about the cause of a problem. Let your data point the way!
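The three things to look for can be checked numerically as well as by eye. Below is a minimal Python sketch, assuming you already have readings in a list and know the normal range. The function names, the 60 to 80 range, and the sample numbers are all hypothetical, chosen only to illustrate the idea.

```python
from statistics import mean


def out_of_range(readings, low, high):
    """Return (index, value) pairs for readings outside the normal range [low, high]."""
    return [(i, v) for i, v in enumerate(readings) if not (low <= v <= high)]


def cycle_lengths(cycle_start_times):
    """Time between consecutive cycle starts; growing values mean slower cycles."""
    return [b - a for a, b in zip(cycle_start_times, cycle_start_times[1:])]


def trend_slope(readings):
    """Least-squares slope of readings against their index (units per sample)."""
    n = len(readings)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(readings)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(readings))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


# Hypothetical hourly temperature readings; normal range assumed to be 60-80.
temps = [70, 71, 72, 74, 77, 81, 86]
print(out_of_range(temps, 60, 80))        # [(5, 81), (6, 86)]
print(cycle_lengths([0, 10, 20, 31, 43])) # [10, 10, 11, 12]
print(trend_slope(temps) > 0)             # True
```

None of these checks replaces knowing what “normal” looks like for your machine; they only make departures from it easier to spot.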
Start With Recent Changes
Recent changes to a machine or its environment are great starting points for an investigation, as explained in “What’s Changed?”
Start With The People On The Front Lines
Have you talked to the people on the front lines who might know the answer? I discovered this, quite accidentally, after doing some solo analyses of downtime incidents. I would diligently collect and analyze data on my own and be kept up late at night pondering the causes of these problems.
Frequently, after I had found a solution and written a report about it (perhaps months later), I would be casually chatting with one of my engineers and the topic of one of these incidents would come up. In the middle of telling my triumphant troubleshooting story, they would usually stop me and say something like: “Oh yeah, I’ve seen that before: I bet it was…”
Cut to a deflated look on my face. What had taken me days of painstaking investigative work to identify, they already knew, and were carrying around in their heads. Ugh! I would cry out “Why didn’t you say something?!” Frequently, their answer was: “No one asked me.”
After this happened a few times, I began to cast a very wide net when troubleshooting. The things I didn’t know (but were known by others) frequently surprised me, and I had been with the company from day one!
I don’t think I was oblivious, but I sure wasn’t getting our problems in front of the right people. The takeaway is: are you ignoring those who might know the answer or can quickly orient you? People on the front lines (operators, maintenance personnel, programmers, systems administrators, etc.) will have relevant firsthand experiences and useful knowledge. Ask them!
Begin With This Most Excellent List Of Questions
My free 1-page Universal Troubleshooting Guide distills the strategies described in The Art Of Troubleshooting down to a powerful list of questions that you can ask yourself while actually problem-solving. It’s the best way to start any troubleshooting exercise!