Building a Basic Code Coverage Tool
As we mentioned early in the chapter, all the available coverage tools, commercial or otherwise, lack significant features and data visualization methods that are important to the attacker. Instead of fighting with expensive and deficient tools, why not write your own? In this section we present one of the jewels of this book: a simple code coverage tool that can be built with the debugging API calls described elsewhere in this book. The tool should track all conditional branches in the code and note every conditional branch that can be controlled by user-supplied input. The goal is to determine whether the input set has exercised all possible branches that can be controlled.
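To make that goal concrete, the final check can be sketched as follows. This is a minimal illustration, not the tool's own code: the `BranchRecord` type and the `uncovered_branches` function are hypothetical names, and the sketch assumes that every conditional branch has exactly two outcomes (taken and fall-through).

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical, simplified record for one conditional branch.
struct BranchRecord {
    std::uint32_t address = 0;        // address of the conditional jump
    bool user_controlled = false;     // user-supplied data was seen here
    std::set<std::uint32_t> targets;  // distinct destinations observed so far
};

// A conditional branch has two outcomes, so a user-controlled branch with
// fewer than two observed destinations has not been fully exercised.
std::vector<std::uint32_t> uncovered_branches(
        const std::vector<BranchRecord>& records) {
    std::vector<std::uint32_t> gaps;
    for (const BranchRecord& r : records) {
        if (r.user_controlled && r.targets.size() < 2)
            gaps.push_back(r.address);
    }
    return gaps;
}
```

Each address returned marks a branch that the attacker can influence but that the current input set has only driven in one direction, which is where new test inputs are most likely to pay off.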
For the purposes of this example, the tool will run the processor in single-step mode and will track each instruction using a disassembler. The core object we are tracking is a code location. A location is a single continuous block of instructions with no branches. Branch instructions connect all the code locations together. That is, one code location branches to another code location. We want to track all the code locations that have been visited and determine whether user-supplied input is being processed in the code location. The structure we are using to track code locations is as follows:
// A code location
struct item
{
    item()
    {
        subroutine = FALSE;
        is_conditional = FALSE;
        isret = FALSE;
        boron = FALSE;
        address = 0;
        length = 1;
        x = 0;
        y = 0;
        column = 0;
        m_hasdrawn = FALSE;
    }

    bool subroutine;
    bool is_conditional;
    bool isret;
    bool boron;
    bool m_hasdrawn;    // To stop circular references

    int address;
    int length;
    int column;
    int x;
    int y;

    std::string m_disasm;
    std::string m_borons;
    std::list<struct item *> mChildren;

    struct item * lookup(DWORD addr)
    {
        std::list<item *>::iterator i = mChildren.begin();
        while(i != mChildren.end())
        {
            struct item *g = *i;
            if(g->address == addr) return g;
            i++;
        }
        return NULL;
    }
};
Each location has a list of pointers to all branch targets from the location. It also has a string that represents the assembly instructions that make up the location. The following code executes on each single-step event:
struct item *anItem = NULL;

// Make sure we have a fresh context.
theThread->GetThreadContext();

// Disassemble the target instruction.
m_disasm.Disasm( theThread );

// Determine if this is the target of a branch instruction.
if(m_next_is_target || m_next_is_calltarget)
{
    anItem = OnBranchTarget( theThread );
    SetCurrentItemForThread( theThread->m_thread_id, anItem);
    m_next_is_target = FALSE;
    m_next_is_calltarget = FALSE;

    // We have branched, so we need to set the parent/child
    // lists.
    if(old_item)
    {
        // Determine if we are already in the child.
        if(NULL == old_item->lookup(anItem->address))
        {
            old_item->mChildren.push_back(anItem);
        }
    }
}
else
{
    anItem = GetCurrentItemForThread( theThread->m_thread_id );
}

if(anItem)
{
    anItem->m_disasm += m_disasm.m_instruction;
    anItem->m_disasm += '\n';
}

char *_c = m_disasm.m_instruction;
if(strstr(_c, "call"))
{
    m_next_is_calltarget = TRUE;
}
else if(strstr(_c, "ret"))
{
    m_next_is_target = TRUE;
    if(anItem) anItem->isret = TRUE;
}
else if(strstr(_c, "jmp"))
{
    m_next_is_target = TRUE;
}
else if(strstr(_c, "je")  || strstr(_c, "jne") ||
        strstr(_c, "jl")  || strstr(_c, "jle") ||
        strstr(_c, "jz")  || strstr(_c, "jnz") ||
        strstr(_c, "jg")  || strstr(_c, "jge"))
{
    // Conditional branch: the next instruction starts a new location.
    m_next_is_target = TRUE;
    if(anItem) anItem->is_conditional = TRUE;
}
else
{
    // Not a branching instruction,
    // so add one to the current item length.
    if(anItem) anItem->length++;
}

//////////////////////////////////////////////
// Check for boron tag.
//////////////////////////////////////////////
if(anItem && mTagLen)
{
    if(check_boron(theThread, _c, anItem)) anItem->boron = TRUE;
}

old_item = anItem;
First, the code gets a fresh context structure for the thread that just single-stepped. The instruction pointed to by the instruction pointer is then disassembled. If the instruction is the beginning of a new code location, the list of currently mapped locations is queried so that we don't make double entries. The instruction is then compared with a list of known branching instructions, and appropriate flags are set in the item structure. Finally, a check is made for boron tags. The code for a boron tag check is presented in the next section.
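Substring matching on the whole instruction text is simple, but it can misfire when an operand or symbol name happens to contain a mnemonic (for example, a label containing "ret"), and it only recognizes the jumps that are listed explicitly. As a hedged alternative, the sketch below classifies an instruction by its mnemonic alone. The `BranchKind` enum and `classify` function are hypothetical names, and the sketch assumes Intel-syntax output in which the mnemonic is the first whitespace-delimited token.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

enum class BranchKind { None, Call, Ret, Jump, CondJump };

// Classify a disassembled x86 instruction by its mnemonic only.
// Assumption: the mnemonic is the first token; prefixes such as "rep"
// and the loop-family instructions are not handled in this sketch.
BranchKind classify(const std::string& instruction) {
    const std::size_t start = instruction.find_first_not_of(" \t");
    if (start == std::string::npos) return BranchKind::None;
    const std::size_t end = instruction.find_first_of(" \t", start);
    std::string m = instruction.substr(
        start, end == std::string::npos ? std::string::npos : end - start);
    for (char& ch : m)
        ch = static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));

    if (m == "call") return BranchKind::Call;
    if (m == "ret" || m == "retn" || m == "retf") return BranchKind::Ret;
    if (m == "jmp") return BranchKind::Jump;
    // Any other mnemonic beginning with 'j' is treated as a conditional
    // jump (je, jne, ja, jb, js, jo, jecxz, and so on).
    if (!m.empty() && m[0] == 'j') return BranchKind::CondJump;
    return BranchKind::None;
}
```

With a classifier like this, the single-step handler could switch on the returned kind instead of walking a chain of `strstr` calls, and jumps such as `ja` or `jb` would be counted as conditional without further changes.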
Checking for Boron Tags
When a breakpoint or single-step event has occurred, the debugger may wish to query memory for boron tags (that is, substrings that are known to be user supplied). Using the memory query routines introduced earlier in the book, we can make some fairly intelligent queries for boron tags. Because CPU registers are constantly used to store pointers to data, it makes sense to check all the CPU registers for valid memory pointers whenever a breakpoint or single-step event occurs. If a register points to valid memory, we can then query that memory and look for a boron tag. Any code location that uses user-supplied data typically has a pointer to that data in one of the registers. To check the registers, you can use a routine like this:
bool check_boron( CDThread *theThread, char *c, struct item *ip )
{
    // If any of the registers point to the user buffer, tag this.
    DWORD reg;

    // The tag, plus its terminator, must fit in the 255-byte buffer below.
    if(mTagLen >= 255) return FALSE;

    if(strstr(c, "eax"))
    {
        reg = theThread->m_ctx.Eax;
        if(can_read( theThread, (void *)reg ))
        {
            SIZE_T lpRead;
            char string[255];
            string[mTagLen] = '\0';

            // Read the target memory.
            if(ReadProcessMemory( theThread->m_hProcess, (void *)reg, string, mTagLen, &lpRead))
            {
                if(strstr( string, mBoronTag ))
                {
                    // Found the boron string.
                    ip->m_borons += "EAX: ";
                    ip->m_borons += c;
                    ip->m_borons += " > ";
                    ip->m_borons += string;
                    ip->m_borons += '\n';
                    return TRUE;
                }
            }
        }
    }

    .... // Repeat this call for all the registers EAX, EBX, ECX, EDX, ESI, and EDI.

    return FALSE;
}
To save room, we show the code for the EAX register only. The code should query all the registers listed in the comment. The function returns TRUE if the supplied boron tag is found behind one of the memory pointers.
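Rather than repeating the block once per register, the same check can be driven from a table. The following is a portable sketch under stated assumptions, not the tool's own code: `RegisterValue`, `MemoryReader`, and `find_tagged_register` are hypothetical names, and the reader callback stands in for the `can_read` and `ReadProcessMemory` calls used in the routine above.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for one register taken from a thread context.
struct RegisterValue {
    const char* name;      // lowercase name as it appears in the disassembly
    std::uint32_t value;   // current register contents
};

// Hypothetical memory reader: copies `size` bytes at `address` into `out`
// and returns false if that memory cannot be read.
using MemoryReader =
    std::function<bool(std::uint32_t address, char* out, std::size_t size)>;

// Returns the name of the first register that the instruction uses and
// that points at memory beginning with `tag`, or an empty string if none.
std::string find_tagged_register(const std::string& instruction,
                                 const std::vector<RegisterValue>& regs,
                                 const std::string& tag,
                                 const MemoryReader& read_memory) {
    if (tag.empty()) return "";
    std::vector<char> buffer(tag.size());
    for (const RegisterValue& reg : regs) {
        // Only consider registers that the instruction actually mentions.
        if (instruction.find(reg.name) == std::string::npos) continue;
        if (!read_memory(reg.value, buffer.data(), buffer.size())) continue;
        // Compare by length so no terminator or fixed buffer is needed.
        if (tag.compare(0, tag.size(), buffer.data(), buffer.size()) == 0)
            return reg.name;
    }
    return "";
}
```

Sizing the buffer from the tag length sidesteps the fixed 255-byte limit, and adding a register to the check becomes a one-line change to the table passed in by the caller.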