Item 16: Use Specialized Monitoring and Test Equipment
Debugging embedded systems and systems software sometimes requires you to be able to inspect and analyze the whole computing stack, from the hardware to the application. Deep down at the hardware level, debugging involves detecting minute changes in electron flows and the alignment of magnetic moments. In most cases, you can use powerful IDEs as well as tracing and logging software to see what’s going on. Yet there are situations where these tools leave you none the wiser. This often happens when software touches hardware: you think your software behaves as it should, but the hardware seems to have its own ideas. For example, you can see that you write the correct data to disk, but the data appears corrupt when you read it. When you debug problems close to the hardware level, some fancy equipment may offer you significant help.
A general-purpose tool that you may find useful is a logic analyzer. This captures, stores, and analyzes digital signals coming in at speeds of millions of samples per second. Such devices used to cost more than a new car, but now you can buy a cheap USB-based one for about $100. With such a device you can monitor not only arbitrary digital signals on a hardware board but also the higher-level communication protocols that components use to talk to each other. The alphabet soup of supported protocols is bewildering; quoting from one manufacturer (Saleae): “SPI, I2C, serial, 1-Wire, CAN, UNI/O, I2S/PCM, MP Mode, Manchester, Modbus, DMX-512, Parallel, JTAG, LIN, Atmel SWI, MDIO, SWD, LCD HD44780, BiSS C, HDLC, HDMI CEC, PS/2, USB 1.1, Midi.”
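To get a feel for what a protocol decoder in such a tool does, here is a minimal Python sketch that recovers bytes from a captured serial line. It assumes 8N1 framing (idle high, one start bit, eight data bits sent least significant bit first, one stop bit) and a capture exported as a list of 0/1 samples with a whole number of samples per bit. The name `decode_uart` and its parameters are illustrative; they are not part of any analyzer's software.

```python
def decode_uart(samples, samples_per_bit):
    """Decode 8N1 serial frames from a list of 0/1 line samples (idle = 1)."""
    decoded = []
    i = 0
    n = len(samples)
    while i < n:
        if samples[i] == 1:
            i += 1  # line is idle (or inside a stop bit); keep looking
            continue
        # samples[i] is low: treat it as the beginning of a start bit and
        # sample every bit in the middle of its time slot.
        mid = i + samples_per_bit // 2
        stop = mid + 9 * samples_per_bit  # middle of the stop bit
        if stop >= n:
            break  # the capture ends before this frame is complete
        if samples[mid] != 0:
            i += 1  # too short to be a start bit: treat it as a glitch
            continue
        value = 0
        for bit in range(8):
            level = samples[mid + (bit + 1) * samples_per_bit]
            value |= level << bit  # data bits arrive LSB first
        if samples[stop] == 1:
            decoded.append(value)
            i = stop  # resume the search inside the stop bit
        else:
            i += 1  # framing error: skip ahead and resynchronize
    return bytes(decoded)
```

A real decoder also has to cope with clock drift, parity, and inverted signal levels, which this sketch ignores.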
If you specialize in a particular technology, you may want to invest in dedicated equipment, such as a protocol analyzer or bus analyzer. As an example, vehicle and other microcontrollers often communicate over the so-called CAN (controller area network) bus. A number of companies offer stand-alone self-contained modules that can plug into the bus and filter, display, and log the exchanged traffic. Similar products exist for other widely used or specialized physical interconnections and protocols, such as Ethernet, USB, Fibre Channel, SAS, SATA, RapidIO, iSCSI, sFPDP, and OBSAI. In contrast to software-based solutions, these devices are guaranteed to work at the true line rate, they offer support for monitoring multiple traffic lanes, and they allow you to define triggers and filters based on bit patterns observed deep inside the data packet frame.
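The kind of mask-and-match filter such devices evaluate in hardware can be sketched in a few lines of Python. This is only an illustration of the idea, applied to CAN-like frames; the `Frame` class and the `matches` function with its parameters are hypothetical and do not correspond to any vendor's interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    can_id: int   # frame identifier
    data: bytes   # payload bytes


def matches(frame, id_value=0, id_mask=0, offset=0, pattern=b"", mask=None):
    """Return True if the identifier and payload match under the given masks.

    Only the bits set in id_mask (and in each mask byte) are compared.
    """
    if (frame.can_id & id_mask) != (id_value & id_mask):
        return False
    if mask is None:
        mask = b"\xff" * len(pattern)  # compare every bit of the pattern
    window = frame.data[offset:offset + len(pattern)]
    if len(window) < len(pattern):
        return False  # the payload is too short to contain the pattern
    return all((w & m) == (p & m) for w, p, m in zip(window, pattern, mask))


# Select frames whose identifier is in 0x120-0x12F and whose second
# payload byte has its top bit set.
frames = [Frame(0x123, b"\x00\x8f"), Frame(0x223, b"\x00\x8f"),
          Frame(0x123, b"\x00\x0f")]
selected = [f for f in frames
            if matches(f, id_value=0x120, id_mask=0x7F0,
                       offset=1, pattern=b"\x80", mask=b"\x80")]
```

Dedicated equipment performs this comparison at line rate, which is exactly what a software filter like this one cannot promise.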
If you lack dedicated debugging hardware, don’t shy away from improvising something to suit your needs. This can help you investigate problems that are difficult to replicate. A few years ago we faced a problem with missing data from a web form drop box. The trouble was that the occurrence of the problem was rare and impossible to reproduce, though it affected many quite vocal users for days. (The application was used by hundreds of thousands.) By looking at the distribution of the affected users, we found that these were often based in remote areas. Hypothesizing that the problem had to do with the quality of their Internet connection, I took a USB wireless modem, wrapped it in tinfoil to simulate a marginal connection, and used that to connect to the application’s web interface. I could immediately see the exact problem, and, armed with an easy way to replicate it (see Item 10: “Enable the Efficient Reproduction of the Problem”), we were able to solve it in just a few hours.
If the code you’re trying to debug runs as embedded software on a device that lacks decent I/O, there are several tricks you can play to communicate with the software you’re debugging.
If the device has a status light or if it can beep, use coded flashes or beeps to indicate what the software is doing. For example, one short beep could signify that the software has entered a particular routine and two that it has exited. You can send more sophisticated messages with Morse Code.
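As a rough sketch of the Morse Code idea, the following Python code turns a message into a sequence of on/off intervals using the conventional timing (a dash lasts three dots, with gaps of one, three, and seven dots between symbols, letters, and words). The names `morse_schedule` and `signal` are made up for this example, and `set_output` stands for whatever routine switches your device's light or beeper on and off.

```python
import time

MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
    "0": "-----", "1": ".----", "2": "..---", "3": "...--", "4": "....-",
    "5": ".....", "6": "-....", "7": "--...", "8": "---..", "9": "----.",
}


def morse_schedule(message, unit=0.1):
    """Return a list of (output_on, seconds) pairs for letters and digits."""
    schedule = []
    for word_index, word in enumerate(message.upper().split()):
        if word_index:
            schedule.append((False, 7 * unit))      # gap between words
        for letter_index, letter in enumerate(word):
            if letter_index:
                schedule.append((False, 3 * unit))  # gap between letters
            for symbol_index, symbol in enumerate(MORSE[letter]):
                if symbol_index:
                    schedule.append((False, unit))  # gap between symbols
                length = unit if symbol == "." else 3 * unit
                schedule.append((True, length))
    return schedule


def signal(message, set_output, sleep=time.sleep, unit=0.1):
    """Drive set_output(True/False) to flash or beep the message."""
    for output_on, seconds in morse_schedule(message, unit):
        set_output(output_on)
        sleep(seconds)
    set_output(False)  # always leave the light or beeper off
```

On a microcontroller you would pass a function that toggles the relevant pin; on a workstation, `signal("E42", print)` shows the sequence of states.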
Store log output in non-volatile storage (even on an external USB thumb drive), and then retrieve the data on your own workstation in order to analyze it.
Implement a simple serial data encoder, and use that to write the data on an unused I/O pin. Then, you can level-convert the signal to RS-232 levels and use a serial-to-USB adapter and a terminal application to read the data on a modern computer.
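Such an encoder can be very small. The sketch below, written in Python for readability, assumes 8N1 framing (start bit, eight data bits sent least significant bit first, stop bit) and a hypothetical `set_pin` function that drives the spare I/O pin high (1) or low (0). On real hardware you would typically derive the bit timing from a hardware timer rather than a sleep call, and pick a low baud rate to tolerate jitter.

```python
import time


def frame_bits(byte):
    """Return the ten line levels of one 8N1 frame for the given byte."""
    start, stop = 0, 1
    data = [(byte >> i) & 1 for i in range(8)]  # least significant bit first
    return [start] + data + [stop]


def send_bytes(data, set_pin, baud=300, sleep=time.sleep):
    """Bit-bang the bytes in data on a pin driven through set_pin(level)."""
    bit_time = 1.0 / baud
    set_pin(1)        # the line idles high
    sleep(bit_time)   # give the receiver time to see the idle state
    for byte in data:
        for level in frame_bits(byte):
            set_pin(level)
            sleep(bit_time)
```

For example, `send_bytes(b"init done\n", set_pin)` would emit a short progress message that a terminal application configured for 300 baud, 8N1 could display.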
If the device has a network connection, you can obviously communicate through it. If it lacks the software for network logging (see Item 41: “Add Logging Statements”) or remote shell access, you can communicate with the outside world through HTTP or even DNS requests.
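The DNS trick can be sketched as follows: encode each log message into the host name of a lookup for a domain whose name server you control, and read the messages from that server's query log. In this Python sketch the domain `log.example.com` and the function names are placeholders, and the scheme (Base32 payload labels followed by a sequence number, which also keeps resolvers from answering out of their cache) is just one possible convention.

```python
import base64
import socket

LOG_DOMAIN = "log.example.com"  # placeholder: a domain whose server you run


def encode_log_name(message, sequence, domain=LOG_DOMAIN):
    """Pack a message and a sequence number into a DNS host name."""
    payload = base64.b32encode(message.encode("utf-8")).decode("ascii")
    payload = payload.rstrip("=").lower()
    # A DNS label can hold at most 63 characters, a whole name about 253.
    labels = [payload[i:i + 63] for i in range(0, len(payload), 63)]
    name = ".".join(labels + [str(sequence), domain])
    if len(name) > 253:
        raise ValueError("message too long for a single DNS name")
    return name


def decode_log_name(name, domain=LOG_DOMAIN):
    """Recover (sequence, message) from a name seen in the server's log."""
    labels = name[:-len(domain) - 1].split(".")
    payload = "".join(labels[:-1]).upper()
    payload += "=" * (-len(payload) % 8)  # restore the Base32 padding
    return int(labels[-1]), base64.b32decode(payload).decode("utf-8")


def send_log(message, sequence, domain=LOG_DOMAIN):
    """Trigger the lookup; the query itself carries the data."""
    try:
        socket.gethostbyname(encode_log_name(message, sequence, domain))
    except OSError:
        pass  # the name need not resolve for the query to be logged
```

On the receiving side, running `decode_log_name` over the names recorded by the name server reconstructs the messages in sequence order.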
When you’re monitoring network packets, you can set up your network’s hardware in a way that allows you to use a software packet analyzer, such as the open-source Wireshark package. The Wireshark version running on my laptop claims to support 1,514 (network and USB) protocols and packet types. If you can run the application you want to debug on the same host as Wireshark, then network packet monitoring can be child’s play. Fire up Wireshark, specify the packets you want to capture, and then look at them in detail.
Monitoring traffic between other hosts, such as that between an application server and a database server or a load balancer, can be trickier. The problem is that switches, the devices that connect together Ethernet cabling, isolate the traffic flowing between two ports from the others. You have numerous options to overcome this difficulty.
If your organization is using a managed (read “expensive”) switch, then you can set up one port to mirror the traffic on another port. Mirroring the traffic of the server you want to monitor on the port where your computer running Wireshark is connected allows you to capture and analyze the server’s traffic.
If you don’t have access to a managed switch, try to get hold of an Ethernet hub. These are much simpler devices that broadcast the Ethernet traffic they receive on all ports. Hubs are no longer made, and this is why they’re often more expensive than cheap switches. Connect the computer you want to monitor and the one running Wireshark to the hub, and you’re ready to go.
Yet another way to monitor remote hosts involves using a command-line tool such as tcpdump. Remotely log in to the host you want to monitor, and run tcpdump to see the network packets that interest you. (You will need administrator privileges to do that.) If you want to perform further analysis with Wireshark’s GUI, you can write the raw packets into a file with tcpdump’s -w option, which you can then analyze in detail with Wireshark. This mode of working is particularly useful with cloud-based hosts, where you can’t easily tinker with the networking configuration.
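A capture file written with tcpdump’s -w option can also be processed by your own scripts, which is handy when you want to automate a check rather than eyeball packets. The following Python sketch reads the classic pcap file layout (a 24-byte file header followed by 16-byte record headers and the captured bytes). The function name `read_pcap` is an invention for this example, and the sketch does not handle the newer pcapng format.

```python
import struct

# Magic numbers as they appear on disk, mapped to
# (struct byte-order prefix, timestamp fractions per second).
PCAP_MAGIC = {
    b"\xd4\xc3\xb2\xa1": ("<", 1e6),  # little-endian, microseconds
    b"\xa1\xb2\xc3\xd4": (">", 1e6),  # big-endian, microseconds
    b"\x4d\x3c\xb2\xa1": ("<", 1e9),  # little-endian, nanoseconds
    b"\xa1\xb2\x3c\x4d": (">", 1e9),  # big-endian, nanoseconds
}


def read_pcap(stream):
    """Yield (timestamp, original_length, data) for each captured packet."""
    header = stream.read(24)
    if len(header) < 24 or header[:4] not in PCAP_MAGIC:
        raise ValueError("not a classic pcap file")
    order, fractions = PCAP_MAGIC[header[:4]]
    while True:
        record = stream.read(16)
        if len(record) < 16:
            return  # end of file (or a truncated final record)
        seconds, fraction, captured, original = struct.unpack(
            order + "IIII", record)
        data = stream.read(captured)
        yield seconds + fraction / fractions, original, data
```

Assuming the capture was saved as capture.pcap, opening it with `open("capture.pcap", "rb")` and iterating over `read_pcap` gives you each packet's timestamp, its length on the wire, and its raw bytes.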
One last possibility involves setting up a computer to bridge network packets between the computer you want to monitor and the rest of the network. You configure that computer with two network interfaces (e.g., its native port and a USB-based one), and bridge the two ports together. On Linux use the brctl command; on FreeBSD configure the if_bridge driver.
You can also use a similarly configured device to simulate various networking scenarios such as packets coming from hosts around the world, traffic shaping, bandwidth limitations, and firewall configurations. Here the software you’ll want to use is iptables, which runs under Linux.
Things to Remember
A logic, bus, or protocol analyzer can help you pinpoint problems that occur near the hardware level.
A home-brew contraption may help you investigate problems related to hardware.
Monitor network packets with Wireshark and an Ethernet hub, a managed switch, or command-line capture.