- Item 9: Set Yourself Up for Debugging Success
- Item 10: Enable the Efficient Reproduction of the Problem
- Item 11: Minimize the Turnaround Time from Your Changes to Their Result
- Item 12: Automate Complex Testing Scenarios
- Item 14: Consider Updating Your Software
- Item 15: Consult Third-Party Source Code for Insights on Its Use
- Item 16: Use Specialized Monitoring and Test Equipment
- Item 17: Increase the Prominence of a Failure's Effects
- Item 18: Enable the Debugging of Unwieldy Systems from Your Desk
- Item 19: Automate Debugging Tasks
- Item 20: Houseclean Before and After Debugging
- Item 21: Fix All Instances of a Problem Class
Item 10: Enable the Efficient Reproduction of the Problem
A key to effective debugging is a problem that you can reliably and easily reproduce. You need this for a number of reasons. First, if you can always reproduce the issue with a single press of a button, you can focus on tracking down the cause rather than wasting time fumbling to make the problem appear. In addition, if you have an easy way to reproduce the problem, you can readily package its description and ask for outside help (see Item 2: “Use Focused Queries to Search the Web for Insights into Your Problem”). Finally, once you fix the fault, you can easily demonstrate that your fix works by rerunning the sequence that demonstrated the problem and witnessing that the failure no longer occurs.
Creating a short example or a test case that reproduces the problem can go a long way in increasing your efficiency. The gold standard is a minimal example: the shortest possible one that reproduces the problem. The platinum standard, which goes under the name SSCCE, for short, self-contained, correct example (see Item 1: “Handle All Problems through an Issue-Tracking System”), has the example be not only short, but also self-contained and correct (compilable and runnable). With a minimal example at hand, you won’t waste time exploring code paths that could have been eliminated. Also, any logs and traces you create and must examine won’t be longer than what’s actually needed. Finally, a short example will execute more quickly than a longer one, especially when executed in a debugging mode that imposes a significant performance overhead.
To shorten your example, you can proceed top-down or bottom-up (see Item 4: “Drill Up from the Problem to the Bug or Down from the Program’s Start to the Bug”). Select the most expedient method. If the code has many dependencies, starting bottom-up from a clean slate may be preferable. If you don’t really understand the problem’s likely cause, creating a test case in a top-down fashion may help you narrow down the possibilities.
In the bottom-up fashion, you theorize the cause of the problem, for example, a call to a specific API, and you build up a test case that demonstrates the problem. In one case, I was trying to find out why a 27,000-line program was extremely slow in the complex code it used for processing its input files. By looking at the program’s invoked system calls (see Item 58: “Trace the Code’s Execution”), I hypothesized that the problem had something to do with calling tellg (a function returning the file stream’s offset) while reading the file. Indeed, running the following short snippet confirmed my suspicion and was also useful for testing the workaround (a wrapper class).
ifstream in(fname.c_str(), ios::binary);
int val;
do {
    (void)in.tellg();    // Query the stream offset before every read
} while ((val = in.get()) != EOF);
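To turn a snippet like this into a self-contained example, it can help to wrap it in a function and time it with and without the suspect call. The following is a minimal sketch under that assumption; the names read_all and seconds_to_read are illustrative and do not come from the original program.

```cpp
#include <chrono>
#include <cstdio>
#include <fstream>
#include <string>

// Read the file one byte at a time, optionally calling tellg before
// each read. Returns the number of bytes read.
long read_all(const std::string& fname, bool call_tellg) {
    std::ifstream in(fname.c_str(), std::ios::binary);
    long count = 0;
    for (;;) {
        if (call_tellg)
            (void)in.tellg();      // The call suspected of being slow
        if (in.get() == EOF)
            break;
        ++count;
    }
    return count;
}

// Wall-clock seconds needed to read the whole file.
double seconds_to_read(const std::string& fname, bool call_tellg) {
    auto start = std::chrono::steady_clock::now();
    read_all(fname, call_tellg);
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Comparing seconds_to_read(fname, true) with seconds_to_read(fname, false) on a large input file would show whether the call accounts for the slowdown on a given platform; the size of any difference depends on the standard library implementation.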
In the top-down fashion, you remove elements from the scenario that demonstrates the problem, until there’s nothing left to remove. A binary search technique is often quite useful. Say you have an HTML file that makes the browser behave in an erratic way. First eliminate the file’s head elements. If the problem persists, eliminate the body elements. If that cures the problem, restore the body elements, and then remove half of them. Repeat the process until you’ve nailed down the elements that cause the problem. Keeping your editor open and using its undo function to backtrack when you follow a wrong path will mightily increase your efficiency.
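When the scenario's elements can be handled programmatically, the halving process described above can be automated. The following is a minimal sketch, a simplified relative of delta debugging rather than a full implementation. The function name shrink and the reproduces callback are hypothetical; the callback is assumed to return true when the given elements still trigger the problem.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

using Elements = std::vector<std::string>;

// Try to drop chunks of elements, keeping each removal only if the
// problem still reproduces. Start with half of the elements and halve
// the chunk size whenever no more chunks of that size can be removed.
Elements shrink(Elements items,
                const std::function<bool(const Elements&)>& reproduces) {
    for (std::size_t chunk = std::max<std::size_t>(items.size() / 2, 1);
         chunk >= 1; chunk /= 2) {
        std::size_t start = 0;
        while (start < items.size()) {
            std::size_t end = std::min(start + chunk, items.size());
            // Build a candidate without the elements in [start, end)
            Elements candidate(items.begin(), items.begin() + start);
            candidate.insert(candidate.end(),
                             items.begin() + end, items.end());
            if (reproduces(candidate))
                items = candidate;   // Removal was harmless: keep it
            else
                start = end;         // Chunk is needed: move past it
        }
    }
    return items;
}
```

In the HTML scenario, items could hold the file's top-level elements and reproduces could write them to a file, load it in the browser, and report whether the erratic behavior appears. The result is not guaranteed to be the smallest possible failing set, only one from which no single remaining element can be removed.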
With a short example at hand, it’s also easy to make it self-contained. This means that you can take the example and replicate the problem somewhere else without external dependencies, such as libraries, headers, CSS files, and web services. If your test case requires some external elements, you can bundle them with it. Use a portable notation for referring to them, avoiding things such as absolute file paths and hard-coded IP addresses. For instance, use ../resources/file.css rather than /home/susan/resources/file.css, and http://localhost:8081/myService rather than http://193.92.66.100:8081/myService. A self-contained example will make it easier for you to try it on the customer’s premises, examine it on another platform (say, on Windows instead of Linux), publish it on a Q&A forum (see Item 2: “Use Focused Queries to Search the Web for Insights into Your Problem”), and ship it to a vendor for further help.
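If the test case itself is a program, one way to keep its references portable is to resolve bundled resources relative to the directory that holds the test case. The following is a minimal sketch, assuming a C++17 compiler with std::filesystem; the function name bundled_resource is illustrative.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Resolve a bundled resource relative to the directory holding the
// test case, so that no absolute path is embedded in the example.
fs::path bundled_resource(const fs::path& test_case_dir,
                          const fs::path& relative) {
    // lexically_normal() collapses "." and ".." without touching the disk
    return (test_case_dir / relative).lexically_normal();
}
```

With this helper, only the test case's own location varies between machines; the reference ../resources/file.css stays the same wherever the bundle is unpacked.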
In addition, you want to work in a replicable execution environment. If you don’t nail down the code you’re working on and the system it executes in, then you might end up searching for a bug that simply isn’t there. Consider the case of debugging a software installer. Every time you run it, it messes up your operating system configuration, which is exactly what you want to avoid when you’re trying to debug it. In this case, a useful technique is to create a virtual machine image with a pristine system in a state ready for the software installation. After every failed installation, you can simply start afresh with that image. You can also often achieve a similar result using operating-system-level virtualization or containment with a tool such as Docker. Even better, consider adopting a system configuration management tool, such as Ansible, CFEngine, Chef, Puppet, or Salt. These tools allow you to reliably create a specified system configuration from your high-level instructions. This makes it easy to maintain compatible production, testing, and development environments, and to control their evolution in the same way as you control your software.
You also want to be able to reliably replicate the failing version of your software. To do this, first put your software under configuration management with a tool such as Git. Then, make your build process embed into the software an identifier of the source code version used for the build. The following shell command will print a variable initialization with the abbreviated Git hash of the last commit, which you can embed into your source code.
git log -n 1 --format='const string version = "%h";'
Here is an example of its output.
const string version = "035cd45";
Add to your software a way to display this version string; a command-line option or a line in the About dialog is all that’s needed. With this version identifier at hand you can then obtain a copy of the failing source code with a command such as the following:
git checkout 035cd45
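The command-line option for displaying the version could be sketched as follows. This is a minimal illustration, not a prescribed design: the option name --version, the function names, and the banner wording are assumptions, and the version line stands in for the one that the git log command would generate at build time.

```cpp
#include <string>

using std::string;

// Stand-in for the line generated at build time by the git log command
const string version = "035cd45";

// Text to show for a version request or in an About dialog
string version_banner() {
    return "Built from commit " + version;
}

// True if any command-line argument asks for the version
// (the option name --version is an assumption)
bool wants_version(int argc, const char* const argv[]) {
    for (int i = 1; i < argc; ++i)
        if (string(argv[i]) == "--version")
            return true;
    return false;
}
```

A program's main function could call wants_version(argc, argv) early on and, if it returns true, print version_banner() and exit. A user reporting a failure can then quote the hash, which is exactly the argument that git checkout needs.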
If you want to increase the fidelity of builds you run on old code, don’t forget to put under version control all elements that affect what ends up in your distribution, such as the compiler, system and third-party libraries and header files, as well as the build specification (the Makefiles or IDE project configuration). As a final step, you may need to remove the variability introduced by your tools and your runtime environment (see Item 52: “Configure Deterministic Builds and Executions”).
Things to Remember
- Reproducible runs simplify your debugging process.
- Create a short self-contained example that reproduces the problem.
- Have mechanisms to create a replicable execution environment.
- Use a revision control system to label and retrieve your software’s versions.