Decompilation in Practice: Reversing helpctr.exe
The following example illustrates a reverse engineering session against helpctr.exe, a Microsoft program provided with the Windows XP OS. The program happens to have a security vulnerability known as a buffer overflow. This particular vulnerability was made public quite some time ago, so revealing it here does not pose a real security threat. What is important for our purposes is describing the process of revealing the fault through reverse engineering. We use IDA-Pro to disassemble the target software. The target program produces a special debug file called a Dr. Watson log. We use only IDA and the information in the debug log to locate the exact coding error that caused the problem. Note that no source code is publicly available for the target software. Figure 3-4 shows IDA in action.
Figure 3-4 A screen shot of IDA-Pro reverse assembling the program helpctr.exe, which is included as part of the Microsoft Windows XP OS. As an exercise, we explore helpctr.exe for a buffer overflow vulnerability.
Bug Report
We learned of this vulnerability just like most people did, by reading a bug report posted to bugtraq, an industry mailing list forum where software problems and security issues are discussed. The report revealed only minor details about the problem, most notably the name of the executable and the input that caused the fault. According to the report, the URL hcp://w.w.w.w.w.w.w.w.w.w.w.w.w.w.w.w.w.w.w.w.w.w.w.w.w.w.w.w., when supplied to Internet Explorer, caused helpctr.exe to launch. The URL then causes an application exception in helpctr.exe (which can be tickled remotely through a Web browser).
We recreate the fault by using the URL as input in a Windows XP environment. A debug log is created by the OS and we then copy the debug log and the helpctr.exe binary to a separate machine for analysis. Note that we used an older Windows NT machine to perform the analysis of this bug. The original XP environment is no longer required once we induce the error and gather the data we need.
The Debug Log
A debug dump is created when the program crashes. A stack trace is included in this log, giving us a hint regarding the location of the faulty code:
0006f8ac 0100b4ab 0006f8d8 00120000 00000103 msvcrt!wcsncat+0x1e
0006fae4 0050004f 00120000 00279b64 00279b44 HelpCtr+0xb4ab
0054004b 00000000 00000000 00000000 00000000 0x50004f
The culprit appears to be a string concatenation function called wcsncat. The stack dump clearly shows our (fairly straightforward) URL string. We can see that the URL string dominates the stack space and thereby overflows other values:
*----> Raw Stack Dump <----*
000000000006f8a8 03 01 00 00 e4 fa 06 00 - ab b4 00 01 d8 f8 06 00 ................
000000000006f8b8 00 00 12 00 03 01 00 00 - d8 f8 06 00 a8 22 03 01 ............."..
000000000006f8c8 f9 00 00 00 b4 20 03 01 - cc 9b 27 00 c1 3e c4 77 ..... ....'..>.w
000000000006f8d8 43 00 3a 00 5c 00 57 00 - 49 00 4e 00 44 00 4f 00 C.:.\.W.I.N.D.O.
000000000006f8e8 57 00 53 00 5c 00 50 00 - 43 00 48 00 65 00 61 00 W.S.\.P.C.H.e.a.
000000000006f8f8 6c 00 74 00 68 00 5c 00 - 48 00 65 00 6c 00 70 00 l.t.h.\.H.e.l.p.
000000000006f908 43 00 74 00 72 00 5c 00 - 56 00 65 00 6e 00 64 00 C.t.r.\.V.e.n.d.
000000000006f918 6f 00 72 00 73 00 5c 00 - 77 00 2e 00 77 00 2e 00 o.r.s.\.w...w...
000000000006f928 77 00 2e 00 77 00 2e 00 - 77 00 2e 00 77 00 2e 00 w...w...w...w...
000000000006f938 77 00 2e 00 77 00 2e 00 - 77 00 2e 00 77 00 2e 00 w...w...w...w...
000000000006f948 77 00 2e 00 77 00 2e 00 - 77 00 2e 00 77 00 2e 00 w...w...w...w...
000000000006f958 77 00 2e 00 77 00 2e 00 - 77 00 2e 00 77 00 2e 00 w...w...w...w...
000000000006f968 77 00 2e 00 77 00 2e 00 - 77 00 2e 00 77 00 2e 00 w...w...w...w...
000000000006f978 77 00 2e 00 77 00 2e 00 - 77 00 2e 00 77 00 2e 00 w...w...w...w...
000000000006f988 77 00 2e 00 77 00 2e 00 - 77 00 2e 00 77 00 2e 00 w...w...w...w...
000000000006f998 77 00 2e 00 77 00 2e 00 - 77 00 2e 00 77 00 2e 00 w...w...w...w...
000000000006f9a8 77 00 2e 00 77 00 2e 00 - 77 00 2e 00 77 00 2e 00 w...w...w...w...
000000000006f9b8 77 00 2e 00 77 00 2e 00 - 77 00 2e 00 77 00 2e 00 w...w...w...w...
000000000006f9c8 77 00 2e 00 77 00 2e 00 - 77 00 2e 00 77 00 2e 00 w...w...w...w...
000000000006f9d8 77 00 2e 00 77 00 2e 00 - 77 00 2e 00 77 00 2e 00 w...w...w...w...
Knowing that wcsncat is the likely culprit, we press onward with our analysis. Using IDA, we can see that wcsncat is called from two locations:
.idata:01001004 extrn wcsncat:dword ; DATA XREF: sub_100B425+62↑r
.idata:01001004 ; sub_100B425+77↑r ...
The behavior of wcsncat is straightforward and can be obtained from a manual. The call takes three parameters:
- A destination buffer (a buffer pointer)
- A source string (user supplied)
- A maximum number of characters to append
The destination buffer is supposed to be large enough to store all the data being appended. (But note that in this case the data are supplied by an outside user, who might be malicious.) This is why the last argument lets the programmer specify the maximum length to append. Think of the buffer as a glass of a particular size, and the subroutine we're calling as a method for adding liquid to the glass. The last argument is supposed to guarantee that the glass does not overflow.
In helpctr.exe, a series of calls are made to wcsncat from within the broken subroutine. The following diagram illustrates the behavior of multiple calls to wcsncat. Assume the destination buffer is 12 characters long and we have already inserted the string ABCD. This leaves a total of eight remaining characters including the terminating NULL character.
wcsncat(target_buffer, "ABCD", 11);
We now make a call to wcsncat() and append the string EF. As the following diagram illustrates, the string is appended to the destination buffer starting at the NULL character. To protect the destination buffer, we must specify that a maximum of seven characters are to be appended. If the terminating NULL character is included, this makes a total of eight. Any more input will write off the end of our buffer and we will have a buffer overflow.
wcsncat(target_buffer, "EF", 7);
Unfortunately, in the faulty subroutine within helpctr.exe, the programmer made a subtle but fatal mistake. Multiple calls are made to wcsncat() but the maximum-length value is never recalculated. In other words, the multiple appends never account for the ever-shrinking space remaining at the end of the destination buffer. The glass is getting full, but nobody is watching as more liquid is poured in. In our illustration, this would be something like appending EFGHIJKLMN to our example buffer, using the maximum length of 11 characters (12 including the NULL). The correct value should be a maximum of seven characters, but we never correct for this and we append past the end of our buffer.
wcsncat(target_buffer, "EFGHIJKLMN", 11);
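The arithmetic behind this failure can be checked without touching any memory. The following is a minimal sketch in C, not code recovered from helpctr.exe; the helper names chars_needed and would_overflow are our own, invented for illustration. It computes how many characters a buffer must hold after a call to wcsncat and compares that figure with the buffer's capacity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from helpctr.exe): total wide characters,
   terminator included, that the destination must hold after
   wcsncat(dst, src, limit) when dst already stores `used` characters
   and src is `src_len` characters long. */
size_t chars_needed(size_t used, size_t src_len, size_t limit)
{
    /* wcsncat copies at most `limit` characters of src ... */
    size_t appended = src_len < limit ? src_len : limit;
    /* ... and then always writes one terminating NULL character. */
    return used + appended + 1;
}

/* Hypothetical helper: nonzero when the call would write past the end
   of a destination buffer that holds `cap` characters in total. */
int would_overflow(size_t cap, size_t used, size_t src_len, size_t limit)
{
    assert(cap > 0);
    return chars_needed(used, src_len, limit) > cap;
}
```

For the 12-character example buffer that already holds ABCD, would_overflow(12, 4, 10, 11) reports an overflow because 15 characters are needed, whereas the recalculated limit of seven in would_overflow(12, 4, 10, 7) fits exactly.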
A graph of the subroutine in helpctr.exe that makes these calls is shown in Figure 3-5.
Figure 3-5 A simple graph of the subroutine in helpctr.exe that makes calls to wcsncat().
A very good reverse engineer can spot and decode the logic that causes this problem in 10 to 15 minutes. An average reverse engineer might be able to reverse the routine in about an hour. The subroutine starts out by checking that it has not been passed a NULL buffer. This is the first JZ branch. If the buffer is valid, we can see that 103h is being set in a register. This is 259 decimal, meaning we have a maximum buffer size of 259 characters. [13] And herein lies the bug. We see that this value is never updated during successive calls to wcsncat. Strings of characters are appended to the target buffer multiple times, but the maximum allowable length is never appropriately reduced. This type of bug is very typical of parsing problems often found in code. Parsing typically includes lexical and syntax analysis of user-supplied strings, but it unfortunately often fails to maintain proper buffer arithmetic.
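One way to keep the buffer arithmetic honest is to recompute the remaining space before every append. The sketch below is our own illustration in C, not the vendor's fix and not code from helpctr.exe; bounded_append is a hypothetical helper, and cap is assumed to be the total capacity of the destination in wide characters, terminator included.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: append src to dst, where dst can hold `cap`
   wide characters in total (terminating NULL included). The limit
   passed to wcsncat is recalculated from what is already stored.
   Returns the number of characters actually appended. */
size_t bounded_append(wchar_t *dst, size_t cap, const wchar_t *src)
{
    assert(dst != NULL && src != NULL && cap > 0);

    size_t used = wcslen(dst);      /* characters already in the glass */
    assert(used < cap);             /* dst must already be terminated  */

    size_t room = cap - used - 1;   /* space left, minus the terminator */
    wcsncat(dst, src, room);        /* copies at most `room` characters,
                                       then writes the terminator */
    return wcslen(dst) - used;
}
```

Applied to the earlier 12-character example, a first call that appends EF to ABCD leaves room for five more characters, so a later attempt to append GHIJKLMN is truncated to GHIJK instead of running off the end of the buffer.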
What is the final conclusion here? A user-supplied variable, in the URL used to spawn helpctr.exe, is passed down to this subroutine, which subsequently uses the data in a buggy series of calls for string concatenation.
Alas, yet another security problem in the world caused by sloppy code. We leave an exploit resulting in machine compromise as an exercise for you to undertake.