2.7 Arc Injection
The first exploit for the get password program, described in Section 2.5, modified the return address to change the control flow of the program (in this case, to circumvent the password protection logic). This technique, known as arc injection (sometimes referred to as return-into-libc), transfers control to code that already exists in the program's memory space. The name refers to the way these exploits insert a new arc (a control-flow transfer) into the program's control-flow graph rather than injecting new code. More sophisticated attacks are possible using this technique, including installing the address of an existing function (such as system() or exec(), which can be used to execute commands and other programs already on the local system) on the stack along with the appropriate arguments. When the return address is popped off the stack (by the ret or iret instruction in IA-32), control is "returned" to an attacker-specified function. By invoking functions like system() or exec(), an attacker could easily create a shell on the compromised machine with the permissions of the compromised program.
Worse yet, an attacker can use arc injection to invoke multiple functions in sequence with arguments that are also supplied by the attacker. In effect, the attacker installs and runs the equivalent of a small program built from chained function calls, which increases the severity of these attacks.
Figure 2–27 contains a program that is vulnerable to a buffer overflow. User-supplied data in user_input is copied to the buff character array on line 4 using memcpy(). A buffer overflow can result if the string in user_input, including its terminating null byte, is larger than the buff buffer.

 1. #include <string.h>
 2. int get_buff(char *user_input){
 3.   char buff[4];
 4.   memcpy(buff, user_input, strlen(user_input) + 1);
 5.   return 0;
 6. }
 7. int main(int argc, char *argv[]){
 8.   get_buff(argv[1]);
 9.   return 0;
10. }

Figure 2–27. Program vulnerable to arc injection exploit

Figure 2–28 (a) shows the contents of the stack before execution of the get_buff() function. The stack consists of the local variable buff, followed by the frame pointer (ebp) and return address (eip) for main(). Below this is the actual stack frame for main() (which is referenced by the stored frame pointer).
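The layout just described can be modeled with an ordinary struct to see which bytes of an oversized copy land on which saved value. The following is a minimal sketch rather than code from the original program: struct model_frame, model_copy(), and reaches_return_address() are hypothetical names introduced for illustration, and the model assumes 32-bit saved registers with no padding between fields, as on IA-32. The copy is clamped to the size of the model, so nothing is actually overflowed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the get_buff() frame on IA-32; not a real stack. */
struct model_frame {
    char     buff[4];    /* local variable */
    uint32_t saved_ebp;  /* stored frame pointer for main() */
    uint32_t saved_eip;  /* return address into main() */
};

/* Copy len input bytes over the model, clamped to the size of the model
 * so that the copy itself stays in bounds. Returns the bytes copied. */
size_t model_copy(struct model_frame *frame, const void *input, size_t len)
{
    if (len > sizeof(*frame)) {
        len = sizeof(*frame);
    }
    memcpy(frame, input, len);
    return len;
}

/* Nonzero if a copy of len bytes starting at buff reaches the saved eip. */
int reaches_return_address(size_t len)
{
    return len > offsetof(struct model_frame, saved_eip);
}
```

Under these assumptions, a 4-byte input changes only buff, while a 12-byte input replaces both the stored frame pointer and the return address with attacker-chosen bytes.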
Figure 2–28 (b) shows the contents of the stack after an attacker has overflowed buff to overwrite the contents of the stack. This portion of the stack has been completely overwritten by the overflow.
An attacker may be able to place data in the actual buffer, but in this example we'll assume that the buffer is overwritten with fill characters. The frame pointer for main() has been overwritten with a frame pointer for Frame 2. This entire frame has been manufactured by the attacker as part of the exploit. When the exploited function (get_buff()) returns, it executes one of the two equivalent frame-pointer-based return sequences shown in Figure 2–28 (c). Regardless of which form is used, the frame pointer (now pointing to Frame 2) is moved into the stack pointer. Control is returned to the address on the stack, which has been overwritten with the address of an arbitrary function f(). This function is called and passed the arguments installed on the stack. The attacker must provide the number and type of arguments that the invoked function expects. In Figure 2–28 (b), we assume that f() accepts a single pointer to a string, as system() does. Because the contents of the string must also be provided, the string itself is placed on the stack after the arguments to the function.
Figure 2–28. Arc injection exploit
When f() returns, it pops the stored eip off the stack and transfers control to this address. In this case, the eip has been overwritten with the address of the return sequence shown in Figure 2–28 (c). This sequence is usually the instructions generated for the return from the exploited function, but it can appear anywhere in the code segment for the process. The return sequence assigns the frame pointer (now pointing to Frame 3) to the stack pointer and returns control to the next arbitrary function to be called (in this case, g()).
An attacker can repeat this sequence as required to invoke a sequence of functions to accomplish an exploit. The attacker could also reproduce the original frame contents on the stack to return control to main() after the exploit has executed.
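The chaining described above can be pictured as a linked list of manufactured frames, each naming the function to run and the frame to use next. The sketch below is only a high-level model under that assumption: struct fake_frame, run_chain(), model_f(), and model_g() are hypothetical names, the targets are harmless functions that record the order in which they run, and none of the register or return-sequence details of a real stack are reproduced.

```c
#include <stddef.h>

/* Hypothetical stand-in for a function reached through a return address. */
typedef void (*model_fn)(const char *arg);

/* Model of one attacker-built frame. */
struct fake_frame {
    const struct fake_frame *next_frame; /* models the stored frame pointer */
    model_fn                 target;     /* models the stored return address */
    const char              *arg;        /* models the argument on the stack */
};

/* Log of which targets ran, in order; zero-initialized, so it always
 * remains a null-terminated string. */
char call_log[8];
size_t call_count;

static void record(char tag)
{
    if (call_count < sizeof(call_log) - 1) {
        call_log[call_count++] = tag;
    }
}

/* Harmless targets standing in for f() and g(). */
void model_f(const char *arg) { (void)arg; record('f'); }
void model_g(const char *arg) { (void)arg; record('g'); }

/* Follow the chain the way successive return sequences would: call the
 * target of the current frame, then move on to the frame it names.
 * max_calls bounds the walk in case the chain is circular.
 * Returns the number of calls made. */
int run_chain(const struct fake_frame *frame, int max_calls)
{
    int calls = 0;
    while (frame != NULL && frame->target != NULL && calls < max_calls) {
        frame->target(frame->arg);
        frame = frame->next_frame;
        calls++;
    }
    return calls;
}
```

Linking a frame for model_f() to a frame for model_g() and passing the first to run_chain() runs the two targets in that order, which mirrors how Frame 2 leads to f() and Frame 3 leads to g().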
An attacker may prefer arc injection over code injection for several reasons. Because arc injection uses code already in memory on the target system, the attacker merely needs to provide the addresses of the functions and their arguments for a successful attack. The footprint for this type of attack can be significantly smaller than that of injected code, so it may be used to exploit vulnerabilities that cannot be exploited by the code injection technique. Because arc injection is a data-based attack, it cannot be defeated by making memory segments (such as the stack) nonexecutable.
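Since a nonexecutable stack does not stop this attack, one defense that remains at the source level is to keep the overflow from happening in the first place. As a hedged sketch, the copy in get_buff() could be bounded by the size of the destination; get_buff_checked(), BUFF_SIZE, and the return convention used here are hypothetical and are not part of the original program.

```c
#include <stddef.h>
#include <string.h>

#define BUFF_SIZE 4

/* Returns 0 on success, or -1 if the input is missing or if the input,
 * including its terminating null byte, would not fit in buff. */
int get_buff_checked(const char *user_input)
{
    char buff[BUFF_SIZE];
    size_t needed;

    if (user_input == NULL) {
        return -1;
    }
    needed = strlen(user_input) + 1;  /* bytes to copy, with the null byte */
    if (needed > sizeof(buff)) {
        return -1;                    /* reject rather than overflow */
    }
    memcpy(buff, user_input, needed);
    return 0;
}
```

With a 4-byte buffer, a 3-character string is accepted and a 4-character string is rejected, because the terminating null byte would be written past the end of buff.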
Chaining function calls together allows for more powerful attacks. A security-conscious programmer, for example, might follow the principle of least privilege [Saltzer 75] and drop privileges when not required. By chaining multiple function calls together, an exploit could regain privileges, for example, by calling setuid() before calling system().
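Whether a chained setuid() call can regain privileges depends on how they were dropped. On POSIX systems, a set-user-ID-root program that lowers only its effective user ID (for example, with seteuid()) keeps root as its saved set-user-ID, so a later setuid(0) succeeds. The sketch below shows one way a program might drop privileges permanently instead. It assumes a set-user-ID-root program on a POSIX system; drop_privileges_permanently() is a hypothetical helper, and supplementary groups are not handled.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: permanently give up set-user-ID-root privileges.
 * Returns 0 on success and -1 on failure. */
int drop_privileges_permanently(void)
{
    uid_t real_uid = getuid();
    gid_t real_gid = getgid();

    /* Drop the group first; once the user ID has been dropped, the
     * process may no longer be allowed to change its group ID. */
    if (setgid(real_gid) != 0) {
        return -1;
    }
    /* While the effective user ID is still root, setuid() sets the real,
     * effective, and saved set-user-IDs together. */
    if (setuid(real_uid) != 0) {
        return -1;
    }
    /* Verify the drop: an unprivileged process must not be able to
     * become root again. */
    if (real_uid != 0 && setuid(0) == 0) {
        return -1;
    }
    return 0;
}
```

After a permanent drop of this kind, a chained call to setuid(0) would be expected to fail, so the exploit would be left with the reduced privileges.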