- Item 9: Set Yourself Up for Debugging Success
- Item 10: Enable the Efficient Reproduction of the Problem
- Item 11: Minimize the Turnaround Time from Your Changes to Their Result
- Item 12: Automate Complex Testing Scenarios
- Item 14: Consider Updating Your Software
- Item 15: Consult Third-Party Source Code for Insights on Its Use
- Item 16: Use Specialized Monitoring and Test Equipment
- Item 17: Increase the Prominence of a Failure's Effects
- Item 18: Enable the Debugging of Unwieldy Systems from Your Desk
- Item 19: Automate Debugging Tasks
- Item 20: Houseclean Before and After Debugging
- Item 21: Fix All Instances of a Problem Class
Item 21: Fix All Instances of a Problem Class
An error in one place is likely to also occur in others, either because a developer behaved in the same way, because a particular API can be easily misused, or because the faulty code was cloned into other places. The debugging process in many mature development cultures and in safety-critical work doesn’t stop when a defect is fixed. The aim is to fix the whole class of defects and ensure that similar defects won’t occur in the future.
For example, if you have addressed a division by zero problem in the following statement
double a = getWeight(subNode) / totalWeight;
search through all the code for other divisions by totalWeight. You can easily do this with your IDE, or with the Unix grep command (see Item 22: “Analyze Debug Data with Unix Command-Line Tools”):
# Find divisions by totalWeight, ignoring spaces after
# the / operator
grep -r '/ *totalWeight' .
Having done that, consider whether there are other divisions in the code that might fail in a similar way. Find them and fix those that might fail. A simple Unix pipeline can again help your search. I used the following to quickly go over suspect instances of division in a body of four million lines of C code.
# Find divisions, assuming spaces around the / operator
grep -r ' / ' . |
# Eliminate those involving sizeof
grep -v '/ sizeof' |
# Color divisors for easy inspection and
# eliminate divisions involving numerical or symbolic constants
grep --color=always ' / [^0-9A-Z][^,;)]*' |
# Remove duplicates
sort -u
Amazingly, the filters successively reduced the suspect lines from 5,731 down to 5,045, then 2,032, and finally to 1,923, an amount of data I could go over within a reasonable time. Although the filters are not bulletproof (sizeof can return zero and a symbolic constant can also evaluate to zero), examining the filtered instances is much better than avoiding the task by claiming that looking at all divisions in the code is too much work.
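Once the suspect divisions are located, fixing them consistently is easier if they all go through one guarded helper. The following is a minimal sketch in C, not the fix applied to the code base described above; checked_divide, weight_fraction, and the fallback value of 0.0 are hypothetical names and choices introduced for illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: divide numerator by denominator and store the
 * quotient in *result. Returns false, leaving *result untouched, when
 * the denominator is zero, so every caller has to decide explicitly
 * how to handle the degenerate case.
 */
bool checked_divide(double numerator, double denominator, double *result)
{
    if (denominator == 0.0)
        return false;
    *result = numerator / denominator;
    return true;
}

/*
 * A call site resembling the original statement. Falling back to a
 * fraction of 0.0 when the total weight is zero is an assumption; the
 * right fallback depends on what the surrounding code expects.
 */
double weight_fraction(double weight, double total_weight)
{
    double fraction;

    if (!checked_divide(weight, total_weight, &fraction))
        return 0.0;
    return fraction;
}
```

Routing the divisions through a single function also gives later searches one name to look for, rather than every use of the / operator.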
Finally, consider what steps you can take to avoid introducing a similar fault in the future. These may involve changes in the code or in your software development process. Here are some examples. If the fault was the misuse of an API function, consider hiding the original one and providing a safer alternative. For instance, you can add the following to your project’s global include file.
#define gets(x) USE_FGETS_RATHER_THAN_GETS(x)
Under this definition, programs that use gets (which is famously vulnerable to buffer overflows) will fail to compile or link. If the fault occurred through the processing of an incorrectly typed value, introduce stricter type checking. You can also locate many faults by adding static analysis to your build or by tightening its configuration (see Item 51: “Use Static Program Analysis”).
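Hiding gets works best when the project also offers the safer alternative that callers are supposed to use. The following is a minimal sketch of such a function built on the standard fgets; the name read_line and its exact behavior (stripping the newline, rejecting empty buffers) are assumptions made for illustration, not part of any standard library.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical replacement for gets: reads at most size - 1 characters
 * from stream into buf, always NUL-terminates the result, and strips a
 * trailing newline if one was read. Returns buf on success, or NULL on
 * end of file, on a read error, or when the buffer is unusable.
 */
char *read_line(char *buf, size_t size, FILE *stream)
{
    /* fgets takes an int length, so reject sizes it cannot represent. */
    if (buf == NULL || size == 0 || size > INT_MAX)
        return NULL;
    if (fgets(buf, (int)size, stream) == NULL)
        return NULL;
    /* strcspn returns the index of the first newline or of the NUL. */
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

Unlike gets, this function cannot write past the end of the buffer. A line longer than the buffer is truncated, and the remainder stays in the stream for the next call, so callers that care about truncation need an additional check.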
Things to Remember
After fixing one fault, find and fix similar ones and take steps to ensure they will not occur in the future.