General-Purpose Methods and Practices
- Item 9: Set Yourself Up for Debugging Success
- Item 10: Enable the Efficient Reproduction of the Problem
- Item 11: Minimize the Turnaround Time from Your Changes to Their Result
- Item 12: Automate Complex Testing Scenarios
- Item 14: Consider Updating Your Software
- Item 15: Consult Third-Party Source Code for Insights on Its Use
- Item 16: Use Specialized Monitoring and Test Equipment
- Item 17: Increase the Prominence of a Failure's Effects
- Item 18: Enable the Debugging of Unwieldy Systems from Your Desk
- Item 19: Automate Debugging Tasks
- Item 20: Houseclean Before and After Debugging
- Item 21: Fix All Instances of a Problem Class
Professor and author Diomidis Spinellis outlines 13 methods, including automation, specialized monitoring, and housecleaning, that can help you debug diverse software and systems failures.
The way you debug a failure often depends on the underlying technology and development platform. Yet, there are methods that you can use in a wide variety of cases.
Item 9: Set Yourself Up for Debugging Success
Software is often extremely complex. The movement of a mechanical watch comprises just over a hundred parts; the wiring of your entire home can have a few times as many simple components. Compare that with typical software systems, which easily consist of tens of thousands of complex statements. At the high end, consider the 9 million lines of code in the Linux kernel against the 4 million physical components in an A380 airliner. Your mind needs all the help it can get to conquer this complexity.
First you need to believe that the problem can be found and fixed. Your state of mind affects your debugging performance; this is what the experts call a match between perceived challenges and skills. If you don’t believe you can conquer the problem, your mind will wander around or give up. In such a case, you may also end up harming the code by patching the symptom, instead of the problem. Here is what you should keep in mind.
If a problem is reproducible, then make no mistake, you can fix it! (Often by following the advice in this book.) If it’s not reproducible, there are ways to make it so. In debugging you typically have two important allies: access to all the data you may require and powerful computers to process it. You can examine the problem manifestation, logs, source code, even machine instructions. You can also add detailed log statements (or at least monitoring probes) in any place of the software stack you want, and then use tools or short scripts to sift through volumes of data to locate the culprit. It is this combined ability to cast a wide net and dive arbitrarily deep when needed that makes debugging possible and, also, a uniquely satisfying experience.
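As a rough illustration of the "short scripts" mentioned above, the following sketch tallies error records per component so that the noisiest part of a system stands out. The log format, the component names, and the function name `count_errors_by_component` are assumptions made for this example; a real log will need its own pattern.

```python
import re
from collections import Counter

# Hypothetical log format, assumed for illustration only:
#   2024-05-01T12:00:03 ERROR [payment] Timeout contacting gateway
LOG_LINE = re.compile(r"^(\S+)\s+(\w+)\s+\[(\w+)\]\s+(.*)$")


def count_errors_by_component(lines):
    """Count ERROR records per component, skipping lines that do not match."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group(2) == "ERROR":
            counts[match.group(3)] += 1
    return counts


# Made-up sample data standing in for a real log file.
sample_log = [
    "2024-05-01T12:00:01 INFO [auth] Login ok",
    "2024-05-01T12:00:03 ERROR [payment] Timeout contacting gateway",
    "2024-05-01T12:00:04 ERROR [payment] Timeout contacting gateway",
    "2024-05-01T12:00:05 ERROR [auth] Token expired",
    "garbage line that does not match",
]

# List components from most to fewest errors.
for component, count in count_errors_by_component(sample_log).most_common():
    print(f"{component}: {count}")
```

To run the same tally on a real file, pass an open file object instead of the sample list, since the function only needs an iterable of lines.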
To be effective in debugging you also need to set aside ample time. Debugging is a very demanding activity, more complex than programming, because it requires you to maintain in your brain both the program’s logic and its underlying effects, often at a low level. It also requires you to set up your environment, breakpoints, logging, windows, and test cases exactly right if the problem is to be reproduced in a productive fashion. Don’t squander all your invested time by stopping before you’ve squashed the bug, or at least before you’ve understood precisely what you need to do to squash it.
The complexity of debugging also requires you to work without distractions. Your brain needs time to enter a state called flow in which you become fully immersed and involved in an activity. According to Mihály Csíkszentmihályi, who coined the term, in the state of flow you align your emotions with the task you perform. Flow can boost your persistence and performance through a sense of accomplishment. These are critical success factors for dealing with the immense difficulty of debugging complex systems. Distractions, such as a popup message, a phone call, a running chat, rolling social network updates, or a colleague asking for help, will drag you out of the flow state, depriving you of its benefits. Avoid them! Quit unneeded applications, enable your phone’s silent mode, and hang a “do not disturb” sign on your monitor (or your office door, should you be so lucky as to have one).
Another helpful strategy is to sleep on a difficult problem. Researchers have found that during sleep our neurons make connections that generalize across seemingly unrelated paths. This can be a great help during debugging. You can often escape from what appears to be a dead end by trying an outside-the-box debugging strategy. Sleep is exactly the process needed to make this new connection. However, for this to work, you need to do it properly. Work hard on the problem before going to sleep, so that your mind has all the data it needs to find a novel solution. Giving up and going for a beer and then to bed at the first difficulty won’t help you a lot. Also, get plenty of sleep so that on the next day the conscious part of your brain can work effectively with the recommendations of its subconscious sibling.
Nobody said that debugging is easy, so to be effective in it you must persist. At the lowest level computers are deterministic, so they allow you to dig down until you isolate the error. At higher levels, nondeterminism (apparent randomness) is introduced to increase expressiveness and efficiency (think of threads). For nondeterministic errors, you can use the fact that computers are fast and programmable to run zillions of cases until you isolate the error. Therefore, debugging dead ends are mostly due to a lack of persistence: a missing test case, an ignored log file, or an unexplored angle of attack.
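One way to picture "running zillions of cases" is a loop that retries an operation under many random seeds and records the seeds that trigger the failure. The sketch below is only a stand-in: a seeded shuffle plays the role of the nondeterminism that thread scheduling would introduce in a real system, and `flaky_sum` is a hypothetical function with a deliberately injected bug.

```python
import random


def flaky_sum(values, rng):
    """Hypothetical buggy function: it drops the last element whenever
    the shuffled order happens to end with the largest value."""
    shuffled = values[:]
    rng.shuffle(shuffled)
    if shuffled[-1] == max(values):
        shuffled = shuffled[:-1]  # the injected bug
    return sum(shuffled)


def find_failing_seeds(trials):
    """Run the function under many seeds and collect those that fail."""
    values = [1, 2, 3, 4, 5]
    expected = sum(values)
    failing = []
    for seed in range(trials):
        rng = random.Random(seed)  # a fixed seed makes each run repeatable
        if flaky_sum(values, rng) != expected:
            failing.append(seed)
    return failing


failing = find_failing_seeds(10_000)
print(f"{len(failing)} of 10000 seeds fail")
if failing:
    print(f"first failing seed: {failing[0]}")
```

Once a failing seed is known, rerunning with that seed reproduces the failure on demand, which turns an intermittent problem into a deterministic one.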
Finally, as an effective debug engineer, you must continuously invest in your environment, tools, and knowledge. Only in this way will you be able to keep your edge over the ever-increasing complexity of the technology stack you’re working on. In retrospect, my most common debugging mistake is insufficient investment in setting up my debugging infrastructure. This may involve failing to do any of the following:
- Prepare a robust minimal test case (see Item 10: “Enable the Efficient Reproduction of the Problem”)
- Automate the bug’s reproduction
- Script a log file’s analysis
- Learn how an API or language feature really works
Once I summon the energy to invest in what’s needed, my debugging productivity receives a large boost. From that point onward, I can often pinpoint the bug in minutes.
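Preparing a minimal test case can itself be scripted. The sketch below is a simplified greedy reducer, loosely in the spirit of delta debugging and not a method the text prescribes: it repeatedly drops chunks of a failing input and keeps any smaller input that still fails. The names `shrink` and `still_fails`, and the bug being simulated, are assumptions made for illustration.

```python
def shrink(failing_input, still_fails):
    """Greedily drop items while the failure persists.

    `still_fails` is a caller-supplied predicate that returns True
    when the bug still shows up for the given input.
    """
    current = list(failing_input)
    chunk = max(len(current) // 2, 1)
    while chunk >= 1:
        index = 0
        while index < len(current):
            # Try the input with one chunk removed.
            candidate = current[:index] + current[index + chunk:]
            if candidate and still_fails(candidate):
                current = candidate  # keep the smaller failing input
            else:
                index += chunk  # this chunk is needed; move on
        chunk //= 2
    return current


# Hypothetical bug: processing fails whenever both 7 and 42 are present.
def still_fails(items):
    return 7 in items and 42 in items


big_input = list(range(100))
print(shrink(big_input, still_fails))  # prints [7, 42]
```

In practice `still_fails` would run the program under test on the candidate input and check for the failure, so each call can be slow; starting with large chunks keeps the number of runs down.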
Things to Remember
- Believe that the problem can be traced and fixed.
- Set aside sufficient time for your debugging task.
- Arrange to work without distractions.
- Sleep on a difficult problem.
- Don’t give up.
- Invest in your environment, tools, and knowledge.