2.6. Runtime Protection Strategies
Detection and Recovery
Detection and recovery mitigation strategies generally make changes to the runtime environment to detect buffer overflows when they occur so that the application or operating system can recover from the error (or at least fail safely). Because attackers have numerous options for controlling execution after a buffer overflow occurs, detection and recovery are not as effective as prevention and should not be relied on as the only mitigation strategy. However, detection and recovery mitigations generally form a second line of defense in case the “outer perimeter” is compromised. There is a danger that programmers can believe they have solved the problem by using an incomplete detection and recovery strategy, giving them false confidence in vulnerable software. Such strategies should be employed and then forgotten to avoid such biases.
Buffer overflow mitigation strategies can be classified according to which component of the entire system provides the mitigation mechanism:
- The developer via input validation
- The compiler and its associated runtime system
- The operating system
Input Validation
The best way to mitigate buffer overflows is to prevent them. Doing so requires developers to prevent string or memory copies from overflowing their destination buffers. Buffer overflows can be prevented by ensuring that input data does not exceed the size of the smallest buffer in which it is stored. Example 2.15 is a simple function that performs input validation.
Example 2.15. Input Validation
#include <stdlib.h>
#include <string.h>

void f(const char *arg) {
  char buff[100];
  if (strlen(arg) >= sizeof(buff)) {
    abort();
  }
  strcpy(buff, arg);
  /* ... */
}
Any data that arrives at a program interface across a trust boundary requires validation. Examples of such data include the argv and argc arguments to function main() and environment variables, as well as data read from sockets, pipes, files, signals, shared memory, and devices.
Although this example is concerned only with string length, many other types of validation are possible. For example, input that is meant to be sent to a SQL database will require validation to detect and prevent SQL injection attacks. If the input may eventually go to a Web page, it should also be validated to guard against cross-site scripting (XSS) attacks.
Fortunately, input validation works for all classes of string exploits, but it requires that developers correctly identify and validate all of the external inputs that might result in buffer overflows or other vulnerabilities. Because this process is error prone, it is usually prudent to combine this mitigation strategy with others (for example, replacing suspect functions with more secure ones).
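The check in Example 2.15 aborts on bad input. When the caller needs to recover instead, the same validation can report failure through a return value. The following is a minimal sketch under that assumption; the name copy_validated() and its return convention are illustrative and do not come from the original example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if it fits, including the
 * terminating null character. Returns 0 on success and -1 if the input is
 * rejected, in which case dst is left unchanged. */
int copy_validated(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL);  /* programming errors, not input errors */
    if (dst_size == 0 || strlen(src) >= dst_size) {
        return -1;                       /* too long: reject rather than truncate */
    }
    strcpy(dst, src);                    /* safe: the length was validated above */
    return 0;
}
```

A caller would pass sizeof on the destination array, for example copy_validated(buff, sizeof(buff), arg), and treat a return value of -1 as invalid input.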
Object Size Checking
The GNU Compiler Collection (GCC) provides limited functionality to access the size of an object given a pointer into that object. Starting with version 4.1, GCC introduced the __builtin_object_size() function to provide this capability. Its signature is size_t __builtin_object_size(const void *ptr, int type). The first argument is a pointer into any object. This pointer may, but is not required to, point to the start of the object. For example, if the object is a string or character array, the pointer may point to the first character or to any character in the array’s range. The second argument provides details about the referenced object and may have any value from 0 to 3. The function returns the number of bytes from the referenced byte to the final byte of the referenced object.
This function is limited to objects whose ranges can be determined at compile time. If GCC cannot determine which object is referenced, or if it cannot determine the size of this object, then this function returns either 0 or –1, both invalid sizes. For the compiler to be able to determine the size of the object, the program must be compiled with optimization level -O1 or greater.
The second argument indicates details about the referenced object. If this argument is 0 or 2, then the referenced object is the largest object containing the pointed-to byte; otherwise, the object in question is the smallest object containing the pointed-to byte. To illustrate this distinction, consider the following code:
struct V {
  char buf1[10];
  int b;
  char buf2[10];
} var;
void *ptr = &var.b;
If ptr is passed to __builtin_object_size() with type set to 0, then the value returned is the number of bytes from var.b to the end of var, inclusive. (This value will be at least the sum of sizeof(int) and 10 for the buf2 array.) However, if type is 1, then the value returned is the number of bytes from var.b to the end of var.b, inclusive (that is, sizeof(int)).
If __builtin_object_size() cannot determine the size of the pointed-to object, it returns (size_t) -1 if the second argument is 0 or 1. If the second argument is 2 or 3, it returns (size_t) 0. Table 2.9 summarizes how the type argument affects the behavior of __builtin_object_size().
Table 2.9. Behavior Effects of type on __builtin_object_size()
Value of type Argument | Operates on | If Unknown, Returns
0 | Maximum object | (size_t) -1
1 | Minimum object | (size_t) -1
2 | Maximum object | (size_t) 0
3 | Minimum object | (size_t) 0
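The difference between the maximum and minimum object can be observed directly. The sketch below reuses struct V from the text and wraps the two queries in small functions; the function names are illustrative. Because the result depends on the compiler and optimization level, the check accepts either the layout-derived size or the "unknown" value (size_t) -1.

```c
#include <assert.h>
#include <stddef.h>

struct V {
    char buf1[10];
    int b;
    char buf2[10];
};

static struct V var;

/* Type 0: bytes from var.b to the end of the whole object var. */
size_t remaining_in_whole_object(void) {
    return __builtin_object_size(&var.b, 0);
}

/* Type 1: bytes from var.b to the end of the member var.b itself. */
size_t remaining_in_subobject(void) {
    return __builtin_object_size(&var.b, 1);
}

/* Each result is either unknown or matches the layout of struct V. */
void check_object_sizes(void) {
    size_t whole = remaining_in_whole_object();
    size_t sub = remaining_in_subobject();
    assert(whole == (size_t)-1 ||
           whole == sizeof(struct V) - offsetof(struct V, b));
    assert(sub == (size_t)-1 || sub == sizeof(int));
}
```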
Use of Object Size Checking
The __builtin_object_size() function is used to add lightweight buffer overflow protection to the following standard functions when _FORTIFY_SOURCE is defined:
memcpy() strcpy() strcat() sprintf() vsprintf() memmove() strncpy() strncat() snprintf() vsnprintf() memset() fprintf() vfprintf() printf() vprintf()
Many operating systems that support GCC turn on object size checking by default. Others provide a macro (such as _FORTIFY_SOURCE) to enable the feature as an option. On Red Hat Linux, for example, no protection is performed by default. When _FORTIFY_SOURCE is set to 1 (_FORTIFY_SOURCE=1) and the program is compiled at optimization level 1 or higher, security measures that should not change the behavior of conforming programs are taken. _FORTIFY_SOURCE=2 adds some more checking, but some conforming programs might fail.
For example, the memcpy() function may be implemented as follows when _FORTIFY_SOURCE is defined:
__attribute__ ((__nothrow__)) void *memcpy(
  void * __restrict __dest,
  __const void * __restrict __src,
  size_t __len
) {
  return __memcpy_chk(
    __dest, __src, __len, __builtin_object_size(__dest, 0)
  );
}
When using the memcpy() and strcpy() functions, the following behaviors are possible:
- The following case is known to be correct:
char buf[5];
memcpy(buf, foo, 5);
strcpy(buf, "abcd");
No runtime checking is needed, and consequently the memcpy() and strcpy() functions are called.
- The following case is not known to be correct but is checkable at runtime:
memcpy(buf, foo, n);
strcpy(buf, bar);
The compiler knows the number of bytes remaining in the object but does not know the length of the actual copy that will happen. Alternative functions __memcpy_chk() or __strcpy_chk() are used in this case; these functions check whether buffer overflow happened. If buffer overflow is detected, __chk_fail() is called and typically aborts the application after writing a diagnostic message to stderr.
- The following case is known to be incorrect:
memcpy(buf, foo, 6);
strcpy(buf, "abcde");
The compiler can detect buffer overflows at compile time. It issues warnings and calls the checking alternatives at runtime.
- The last case is when the code is not known to be correct and is not checkable at runtime:
memcpy(p, q, n);
strcpy(p, q);
The compiler does not know the buffer size, and no checking is done. Overflows go undetected in these cases.
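The runtime-checkable case can be illustrated with a hand-written wrapper. The following sketch is written in the spirit of __memcpy_chk() but is not glibc's implementation: the names checked_memcpy() and CHECKED_MEMCPY are hypothetical, and the sketch returns NULL where the real function would call __chk_fail().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* dst_size is the number of bytes known to remain in the destination
 * object, or (size_t)-1 if that size is unknown (no check is possible).
 * Returns dst on success and NULL if the copy would overflow. */
void *checked_memcpy(void *dst, const void *src, size_t n, size_t dst_size) {
    assert(dst != NULL && src != NULL);
    if (dst_size != (size_t)-1 && n > dst_size) {
        return NULL;              /* the real check would abort here */
    }
    return memcpy(dst, src, n);
}

/* Hypothetical convenience macro: let the compiler supply the size. */
#define CHECKED_MEMCPY(dst, src, n) \
    checked_memcpy((dst), (src), (n), __builtin_object_size((dst), 0))
```

When the compiler cannot determine the destination size, the macro passes (size_t) -1 and the wrapper degrades to a plain memcpy(), which mirrors the unchecked case described above.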
Learn More: Using __builtin_object_size()
This function can be used in conjunction with copying operations. For example, a string may be safely copied into a fixed array by checking for the size of the array:
char dest[BUFFER_SIZE];
char *src = /* valid pointer */;
size_t src_end = __builtin_object_size(src, 0);
if (src_end == (size_t) -1 && /* don't know if src is too big */
    strlen(src) < BUFFER_SIZE) {
  strcpy(dest, src);
} else if (src_end <= BUFFER_SIZE) {
  strcpy(dest, src);
} else {
  /* src would overflow dest */
}
The advantage of using __builtin_object_size() is that if it returns a valid size (instead of 0 or –1), then the call to strlen() at runtime is unnecessary and can be bypassed, improving runtime performance.
GCC implements strcpy() as an inline function that calls __builtin___strcpy_chk() when _FORTIFY_SOURCE is defined. Otherwise, strcpy() is an ordinary glibc function. The __builtin___strcpy_chk() function has the following signature:
char *__builtin___strcpy_chk(char *dest, const char *src, size_t dest_end)
This function behaves like strcpy(), but it first checks that the dest buffer is big enough to prevent buffer overflow. The size of the destination is provided via the dest_end parameter, which is typically the result of a call to __builtin_object_size(). This check can often be performed at compile time. If the compiler can determine that buffer overflow never occurs, it can optimize away the runtime check. Similarly, if the compiler determines that buffer overflow always occurs, it issues a warning, and the call aborts at runtime. If the compiler knows the space in the destination string but not the length of the source string, it adds a runtime check. Finally, if the compiler cannot determine how much space exists in the destination string, then the call devolves to standard strcpy() with no check added.
Visual Studio Compiler-Generated Runtime Checks
The MS Visual Studio C++ compiler provides several options to enable certain checks at runtime. These options can be enabled using a specific compiler flag. In particular, the /RTCs compiler flag turns on checks for the following errors:
- Overflows of local variables such as arrays (except when used in a structure with internal padding)
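- Use of uninitialized local variables, which /RTCs helps expose by initializing locals to a nonzero value (the related /RTCu flag reports such uses directly)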
- Stack pointer corruption, which can be caused by a calling convention mismatch
These flags can be tweaked on or off for various regions in the code. For example, the following pragma:
#pragma runtime_checks("s", off)
turns off the /RTCs flag checks for any subsequent functions in the code. The check may be restored with the following pragma:
#pragma runtime_checks("s", restore)
Runtime Bounds Checkers
Although not publicly available, some existing C language compiler and runtime systems do perform array bounds checking.
Libsafe and Libverify
Libsafe, available from Avaya Labs Research, is a dynamic library for limiting the impact of buffer overflows on the stack. The library intercepts and checks the bounds of arguments to C library functions that are susceptible to buffer overflow. The library makes sure that frame pointers and return addresses cannot be overwritten by an intercepted function. The Libverify library, also described by Baratloo and colleagues [Baratloo 2000], implements a return address verification scheme similar to that used in StackGuard but does not require recompilation of source code, which allows it to be used with existing binaries.
CRED
Richard Jones and Paul Kelley [Jones 1997] proposed an approach for bounds checking using referent objects. This approach is based on the principle that an address computed from an in-bounds pointer must share the same referent object as the original pointer. Unfortunately, a surprisingly large number of programs generate and store out-of-bounds addresses and later retrieve these values in their computation without causing buffer overflows, making these programs incompatible with this bounds-checking approach. This approach to runtime bounds checking also has significant performance costs, particularly in pointer-intensive programs in which performance may slow down by up to 30 times [Cowan 2000].
Olatunji Ruwase and Monica Lam [Ruwase 2004] improved the Jones and Kelley approach in their C range error detector (CRED). According to the authors, CRED enforces a relaxed standard of correctness by allowing program manipulations of out-of-bounds addresses that do not result in buffer overflows. This relaxed standard of correctness provides greater compatibility with existing software.
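The difference between the two rules can be sketched with a small object table. This is only an illustration of the idea, not the data structure used by either checker: the table layout, the function names, and the use of integer addresses are all assumptions made to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of a hypothetical object table: the range [base, base + size). */
struct object {
    uintptr_t base;
    size_t size;
};

/* Returns the table entry containing addr, or NULL if there is none. */
static const struct object *find_referent(const struct object *table,
                                          size_t count, uintptr_t addr) {
    for (size_t i = 0; i < count; ++i) {
        if (addr >= table[i].base && addr - table[i].base < table[i].size) {
            return &table[i];
        }
    }
    return NULL;
}

/* Strict rule: an address derived from p must stay inside p's referent
 * object, or be exactly one past its end. */
bool derived_pointer_ok(const struct object *table, size_t count,
                        uintptr_t p, ptrdiff_t offset) {
    assert(table != NULL);
    const struct object *obj = find_referent(table, count, p);
    if (obj == NULL) {
        return false;
    }
    uintptr_t q = p + (uintptr_t)offset;
    return q >= obj->base && q - obj->base <= obj->size;
}

/* Relaxed rule: out-of-bounds addresses may be computed freely, but a
 * dereference must land inside the referent of the original pointer. */
bool dereference_ok(const struct object *table, size_t count,
                    uintptr_t origin, uintptr_t addr) {
    assert(table != NULL);
    const struct object *obj = find_referent(table, count, origin);
    return obj != NULL && addr >= obj->base && addr - obj->base < obj->size;
}
```

Note that under both rules an address that falls inside a neighboring object is still rejected, because the check is made against the referent of the pointer it was derived from rather than against whatever object happens to live at the resulting address.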
CRED can be configured to check all bounds of all data or of string data only. Full bounds checking, like the Jones and Kelley approach, imposes significant performance overhead. Limiting the bounds checking to strings improves the performance for most programs. Overhead ranges from 1 percent to 130 percent depending on the use of strings in the application.
Bounds checking is effective in preventing most overflow conditions but is not perfect. The CRED solution, for example, cannot detect conditions under which an out-of-bounds pointer is cast to an integer, used in an arithmetic operation, and cast back to a pointer. The approach does prevent overflows in the stack, heap, and data segments. CRED, even when optimized to check only for overflows in strings, was effective in detecting 20 different buffer overflow attacks developed by John Wilander and Mariam Kamkar [Wilander 2003] for evaluating dynamic buffer overflow detectors.
CRED has been merged into the latest Jones and Kelley checker for GCC 3.3.1, which is currently maintained by Herman ten Brugge.
Dinakar Dhurjati and Vikram Adve proposed a collection of improvements, including pool allocation, which allows the compiler to generate code that knows where to search for an object in an object table at runtime [Dhurjati 2006]. Performance was improved significantly, but overhead was still as high as 69 percent.
Stack Canaries
Stack canaries are another mechanism used to detect and prevent stack-smashing attacks. Instead of performing generalized bounds checking, canaries are used to protect the return address on the stack from sequential writes through memory (for example, resulting from a call to strcpy()). Canaries consist of a value that is difficult to insert or spoof and are written to an address before the section of the stack being protected. A sequential write would consequently need to overwrite this value on the way to the protected region. The canary is initialized immediately after the return address is saved and checked immediately before the return address is accessed. A canary could consist, for example, of four different termination characters (CR, LF, NULL, and –1). The termination characters would guard against a buffer overflow caused by an unbounded strcpy() call, for example, because an attacker would need to include a null byte in his or her buffer. The canary guards against buffer overflows caused by string operations but not memory copy operations. A hard-to-spoof or random canary is a 32-bit secret random number that changes each time the program is executed. This approach works well as long as the canary remains a secret.
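The mechanism can be simulated without corrupting a real stack. In the sketch below, a byte array stands in for a stack frame (buffer, then canary, then return address), so the "overflow" stays inside one object. The layout, sizes, byte order of the terminator canary, and function names are all assumptions chosen for illustration; real compilers emit the equivalent logic in the function prologue and epilogue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { BUF_SIZE = 16, CANARY_SIZE = 4, RET_SIZE = 8,
       FRAME_SIZE = BUF_SIZE + CANARY_SIZE + RET_SIZE };

/* Simulated frame: buffer | canary | return address. */
struct sim_frame {
    unsigned char bytes[FRAME_SIZE];
};

/* Terminator canary: NULL, CR, LF, and -1 (0xFF). */
static const unsigned char CANARY[CANARY_SIZE] = { 0x00, 0x0D, 0x0A, 0xFF };

/* "Prologue": place the canary between the buffer and the return address. */
void frame_init(struct sim_frame *f) {
    assert(f != NULL);
    memset(f->bytes, 0, FRAME_SIZE);
    memcpy(f->bytes + BUF_SIZE, CANARY, CANARY_SIZE);
}

/* Unchecked sequential write starting at the buffer. The length is clamped
 * to the simulated frame only to keep the demonstration well defined. */
void frame_write(struct sim_frame *f, const unsigned char *data, size_t len) {
    assert(f != NULL && data != NULL);
    if (len > FRAME_SIZE) {
        len = FRAME_SIZE;
    }
    memcpy(f->bytes, data, len);
}

/* "Epilogue": the return address is trusted only if the canary is intact. */
bool frame_canary_intact(const struct sim_frame *f) {
    assert(f != NULL);
    return memcmp(f->bytes + BUF_SIZE, CANARY, CANARY_SIZE) == 0;
}
```

A write of BUF_SIZE bytes or fewer leaves the canary intact; any longer sequential write must pass through the canary before it reaches the return address, so the check fails.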
Canaries are implemented in StackGuard as well as in GCC’s Stack-Smashing Protector, also known as ProPolice, and Microsoft’s Visual C++ .NET as part of the stack buffer overrun detection capability.
The stack buffer overrun detection capability was introduced to the C/C++ compiler in Visual Studio .NET 2002 and has been updated in subsequent versions. The /GS compiler switch instructs the compiler to add start-up code and function epilogue and prologue code to generate and check a random number that is placed in a function’s stack. If this value is corrupted, a handler function is called to terminate the application, reducing the chance that the shellcode attempting to exploit a buffer overrun will execute correctly.
Note that Visual C++ 2005 (and later) also reorders data on the stack to make it harder to predictably corrupt that data. Examples include
- Moving buffers to higher memory than nonbuffers. This step can help protect function pointers that reside on the stack.
- Moving pointer and buffer arguments to lower memory at runtime to mitigate various buffer overrun attacks.
Visual C++ 2010 includes enhancements to /GS that expand the heuristics used to determine when /GS should be enabled for a function and when it can safely be optimized away.
To take advantage of enhanced /GS heuristics when using Visual C++ 2005 Service Pack 1 or later, add the following instruction in a commonly used header file to increase the number of functions protected by /GS:
#pragma strict_gs_check(on)
The rules for determining which functions require /GS protection are more aggressive in Visual C++ 2010 than they are in the compiler’s earlier versions; however, the strict_gs_check rules are even more aggressive than Visual C++ 2010’s rules. Even though Visual C++ 2010 strikes a good balance, strict_gs_check should be used for Internet-facing products.
To use stack buffer overrun detection for Microsoft Visual Studio, you should
- Compile your code with the most recent version of the compiler. At the time of writing, this version is VC++ 2010 (cl.exe version 16.00).
- Add #pragma strict_gs_check(on) to a common header file when using versions of VC++ older than VC++ 2010.
- Add #pragma strict_gs_check(on) to Internet-facing products when using VC++ 2010 and later.
- Compile with the /GS flag.
- Link with libraries that use /GS.
As currently implemented, canaries are useful only against exploits that attempt to overwrite the stack return address by overflowing a buffer on the stack. Canaries do not protect the program from exploits that modify variables, object pointers, or function pointers. Canaries cannot prevent buffer overflows from occurring in any location, including the stack segment. They detect some of these buffer overflows only after the fact. Exploits that overwrite bytes directly to the location of the return address on the stack can defeat terminator and random canaries [Bulba 2000]. To solve these direct access exploits, StackGuard added Random XOR canaries [Wagle 2003] that XOR the return address with the canary. Again, this works well for protecting the return address provided the canary remains a secret. In general, canaries offer weak runtime protection.
Stack-Smashing Protector (ProPolice)
In version 4.1, GCC introduced the Stack-Smashing Protector (SSP) feature, which implements canaries derived from StackGuard [Etoh 2000]. Also known as ProPolice, SSP is a GCC extension for protecting applications written in C from the most common forms of stack buffer overflow exploits and is implemented as an intermediate language translator of GCC. SSP provides buffer overflow detection and variable reordering to avoid the corruption of pointers. Specifically, SSP reorders local variables to place buffers after pointers and copies pointers in function arguments to an area preceding local variable buffers to avoid the corruption of pointers that could be used to further corrupt arbitrary memory locations.
The SSP feature is enabled using GCC command-line arguments. The -fstack-protector and -fno-stack-protector options enable and disable stack-smashing protection for functions with vulnerable objects (such as arrays). The -fstack-protector-all and -fno-stack-protector-all options enable and disable the protection of every function, not just the functions with character arrays. Finally, the -Wstack-protector option emits warnings about functions that receive no stack protection when -fstack-protector is used.
SSP works by introducing a canary to detect changes to the arguments, return address, and previous frame pointer in the stack. SSP inserts code fragments into appropriate locations as follows: a random number is generated for the guard value during application initialization, preventing discovery by an unprivileged user. Unfortunately, this activity can easily exhaust a system’s entropy.
SSP also provides a safer stack structure, as in Figure 2.18.
Figure 2.18. Stack-Smashing Protector (SSP) stack structure
This structure establishes the following constraints:
- Location (A) has no array or pointer variables.
- Location (B) has arrays or structures that contain arrays.
- Location (C) has no arrays.
Placing the guard after the section containing the arrays (B) prevents a buffer overflow from overwriting the arguments, return address, previous frame pointer, or local variables (but not other arrays). This protection has limits, however. For example, the compiler cannot rearrange struct members, so a stack object of a type such as
struct S {
  char buffer[40];
  void (*f)(struct S*);
};
would remain unprotected.
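The reason is visible in the layout: the function pointer follows the buffer inside the same object, so a sequential write reaches it before it reaches any guard. The sketch below simulates such a write while staying within the bounds of the struct; the function names are illustrative, and the comparison is done on the pointer's bytes so that the corrupted pointer is never used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct S {
    char buffer[40];
    void (*f)(struct S *);
};

static void handler(struct S *s) { (void)s; }

/* Simulates a sequential write of len bytes starting at s.buffer. The write
 * is clamped to the struct so it never leaves the object. Returns true if
 * the stored function pointer was changed by the write. */
bool write_clobbers_pointer(size_t len) {
    struct S s;
    void (*original)(struct S *) = handler;

    assert(offsetof(struct S, buffer) == 0);  /* buffer is the first member */
    memset(&s, 0, sizeof s);
    s.f = handler;
    if (len > sizeof s) {
        len = sizeof s;
    }
    memset(&s, 'A', len);
    return memcmp(&s.f, &original, sizeof original) != 0;
}
```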
Operating System Strategies
The prevention strategies described in this section are provided as part of the platform’s runtime support environment, including the operating system and the hardware. They are enabled and controlled by the operating system. Programs running under such an environment may not need to be aware of these added security measures; consequently, these strategies are useful for executing programs for which source code is unavailable.
Unfortunately, this advantage can also be a disadvantage because extra security checks that occur during runtime can accidentally alter or halt the execution of nonmalicious programs, often as a result of previously unknown bugs in the programs. Consequently, such runtime strategies may not be applied to all programs that can be run on the platform. Certain programs must be allowed to run with such strategies disabled, which requires maintaining a whitelist of programs exempt from the strategy; unless carefully maintained, such a whitelist enables attackers to target whitelisted programs, bypassing the runtime security entirely.
Detection and Recovery
Address space layout randomization (ASLR) is a security feature of many operating systems; its purpose is to prevent arbitrary code execution. The feature randomizes the address of memory pages used by the program. ASLR cannot prevent the return address on the stack from being overwritten by a stack-based overflow. However, by randomizing the address of stack pages, it may prevent attackers from correctly predicting the address of the shellcode, system function, or return-oriented programming gadget that they want to invoke. Some ASLR implementations randomize memory addresses every time a program runs; as a result, leaked memory addresses become useless if the program is restarted (perhaps because of a crash).
ASLR reduces the probability but does not eliminate the possibility of a successful attack. It is theoretically possible that attackers could correctly predict or guess the address of their shellcode and overwrite the return pointer on the stack with this value.
Furthermore, even on implementations that randomize addresses on each invocation, ASLR can be bypassed by an attacker on a long-running process. Attackers can execute their shellcode if they can discover its address without terminating the process. They can do so, for example, by exploiting a format-string vulnerability or other information leak to reveal memory contents.
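A simple way to observe the effect of ASLR is to print one address from each memory region and compare the output across runs. The following sketch assumes a POSIX-like platform; the function name is illustrative, and the conversion of a function pointer to void * is not strictly portable C although it is commonly supported.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

static int global_data;

/* Prints one address each from the code, data, heap, and stack regions.
 * Returns 0 on success and -1 if the heap allocation fails. */
int print_layout_sample(FILE *out) {
    assert(out != NULL);
    int local = 0;
    void *heap = malloc(16);
    if (heap == NULL) {
        return -1;
    }
    fprintf(out, "code:  %p\n", (void *)print_layout_sample);
    fprintf(out, "data:  %p\n", (void *)&global_data);
    fprintf(out, "heap:  %p\n", heap);
    fprintf(out, "stack: %p\n", (void *)&local);
    free(heap);
    return 0;
}
```

Calling print_layout_sample(stdout) from main() and running the program several times shows which regions move between runs. The code and data addresses typically change only when the program is built as a position-independent executable.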
Linux
ASLR was first introduced to Linux in the PaX project in 2000. While the PaX patch has not been submitted to the mainstream Linux kernel, many of its features are incorporated into mainstream Linux distributions. For example, ASLR has been part of Ubuntu since 2008 and Debian since 2007. Both platforms allow for fine-grained tuning of ASLR via the following command:
sysctl -w kernel.randomize_va_space=2
Most platforms execute this command during the boot process. The randomize_va_space parameter may take the following values:
- 0 Turns off ASLR completely. This is the default only for platforms that do not support this feature.
- 1 Turns on ASLR for stacks, libraries, and position-independent binary programs.
- 2 Turns on ASLR for the heap as well as for memory randomized by option 1.
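A program can also inspect the current setting. On Linux, the kernel.randomize_va_space parameter is exposed as the file /proc/sys/kernel/randomize_va_space. The helper below is a minimal sketch with an illustrative name; it takes the path as a parameter so that it fails gracefully on platforms without this file.

```c
#include <assert.h>
#include <stdio.h>

/* Reads the ASLR mode from a procfs-style file. Returns 0, 1, or 2 on
 * success and -1 if the file cannot be read or holds any other value. */
int read_aslr_mode(const char *path) {
    assert(path != NULL);
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;
    }
    int mode = -1;
    if (fscanf(fp, "%d", &mode) != 1 || mode < 0 || mode > 2) {
        mode = -1;
    }
    fclose(fp);
    return mode;
}
```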
Windows
ASLR has been available on Windows since Vista. On Windows, ASLR moves executable images into random locations when a system boots, making it harder for exploit code to operate predictably. For a component to support ASLR, all components that it loads must also support ASLR. For example, if A.exe depends on B.dll and C.dll, all three must support ASLR. By default, Windows Vista and subsequent versions of the Windows operating system randomize system dynamic link libraries (DLLs) and executables (EXEs). However, developers of custom DLLs and EXEs must opt in to support ASLR using the /DYNAMICBASE linker option.
Windows ASLR also randomizes heap and stack memory. The heap manager creates the heap at a random location to help reduce the chances that an attempt to exploit a heap-based buffer overrun will succeed. Heap randomization is enabled by default for all applications running on Windows Vista and later. When a thread starts in a process linked with /DYNAMICBASE, Windows Vista and later versions of Windows move the thread’s stack to a random location to help reduce the chances that a stack-based buffer overrun exploit will succeed.
To enable ASLR under Microsoft Windows, you should
- Link with Microsoft Linker version 8.00.50727.161 (the first version to support ASLR) or later
- Link with the /DYNAMICBASE linker switch unless using Microsoft Linker version 10.0 or later, which enables /DYNAMICBASE by default
- Test your application on Windows Vista and later versions, and note and fix failures resulting from the use of ASLR
Nonexecutable Stacks
A nonexecutable stack is a runtime solution to buffer overflows that is designed to prevent executable code from running in the stack segment. Many operating systems can be configured to use nonexecutable stacks.
Nonexecutable stacks are often represented as a panacea in securing against buffer overflow vulnerabilities. However, nonexecutable stacks prevent malicious code from executing only if it is in stack memory. They do not prevent buffer overflows from occurring in the heap or data segments. They do not prevent an attacker from using a buffer overflow to modify a return address, variable, object pointer, or function pointer. And they do not prevent arc injection or injection of the execution code in the heap or data segments. Not allowing an attacker to run executable code on the stack can prevent the exploitation of some vulnerabilities, but it is often only a minor inconvenience to an attacker.
Depending on how they are implemented, nonexecutable stacks can affect performance. Nonexecutable stacks can also break programs that execute code in the stack segment, including Linux signal delivery and GCC trampolines.
W^X
Several operating systems, including OpenBSD, Windows, Linux, and OS X, enforce reduced privileges in the kernel so that no part of the process address space is both writable and executable. This policy is called W xor X, or more concisely W^X, and is supported by the use of a No eXecute (NX) bit on several CPUs.
The NX bit enables memory pages to be marked as data, disabling the execution of code on these pages. This bit is named NX on AMD CPUs, XD (for eXecute Disable) on Intel CPUs, and XN (for eXecute Never) on ARM version 6 and later CPUs. Most modern Intel CPUs and all current AMD CPUs now support this capability.
W^X requires that no code is intended to be executed that is not part of the program itself. This prevents the execution of shellcode on the stack, heap, or data segment. W^X also prevents the intentional execution of code in a data page. For example, a just-in-time (JIT) compiler often constructs assembly code from external data (such as bytecode) and then executes it. To work in this environment, the JIT compiler must conform to these restrictions, for example, by ensuring that pages containing executable instructions are appropriately marked.
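On POSIX systems, a JIT compiler can conform to W^X by writing code into a page that is writable but not executable and then switching the page to executable but not writable. The sketch below shows only that transition using mmap() and mprotect(); the function name is hypothetical, the code bytes are treated as opaque data, and the example deliberately does not execute them.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Stages code bytes in a writable page, then flips the page to read+execute
 * so that it is never writable and executable at the same time. Returns the
 * mapping (released by the caller with munmap) or NULL on error. */
void *stage_code_wx(const unsigned char *code, size_t len, size_t *map_len) {
    assert(code != NULL && map_len != NULL);
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len == 0 || len > (size_t)page) {
        return NULL;                     /* this sketch handles one page only */
    }
    void *mem = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        return NULL;
    }
    memcpy(mem, code, len);              /* page is writable, not executable */
    if (mprotect(mem, (size_t)page, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, (size_t)page);
        return NULL;
    }
    *map_len = (size_t)page;             /* page is executable, not writable */
    return mem;
}
```

Some hardened systems refuse the transition to PROT_EXEC altogether, in which case the function returns NULL and the caller must fall back to an interpreter or report an error.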
Data Execution Prevention
Data execution prevention (DEP) is an implementation of the W^X policy for Microsoft Windows. DEP uses NX technology to prevent the execution of instructions stored in data segments. This feature has been available on Windows since XP Service Pack 2. DEP assumes that no code is intended to be executed that is not part of the program itself. Consequently, it does not properly handle code that is intended to be executed in a “forbidden” page. For example, a JIT compiler often constructs assembly code from external data (such as bytecode) and then executes it, only to be foiled by DEP. Furthermore, DEP can often expose hidden bugs in software.
If your application targets Windows XP Service Pack 3, you should call SetProcessDEPPolicy() to enforce DEP/NX. If it is unknown whether or not the application will run on a down-level platform that includes support for SetProcessDEPPolicy(), call the following code early in the start-up code:
BOOL __cdecl EnableNX(void) {
  HMODULE hK = GetModuleHandleW(L"KERNEL32.DLL");
  BOOL (WINAPI *pfnSetDEP)(DWORD);

  *(FARPROC *) &pfnSetDEP =
    GetProcAddress(hK, "SetProcessDEPPolicy");
  if (pfnSetDEP)
    return (*pfnSetDEP)(PROCESS_DEP_ENABLE);
  return(FALSE);
}
If your application has self-modifying code or performs JIT compilation, DEP may cause the application to fail. To alleviate this issue, you should still opt in to DEP (see the following linker switch) and mark any data that will be used for JIT compilation as follows:
PVOID pBuff = VirtualAlloc(NULL, 4096, MEM_COMMIT, PAGE_READWRITE);
if (pBuff) {
  // Copy executable ASM code to buffer.
  // pCode and cbCode are placeholders for the generated code and its
  // length in bytes (at most 4096); they are not named in the original.
  memcpy_s(pBuff, 4096, pCode, cbCode);

  // Buffer is ready so mark as executable and protect from writes
  DWORD dwOldProtect = 0;
  if (!VirtualProtect(pBuff, 4096, PAGE_EXECUTE_READ, &dwOldProtect)) {
    // error
  } else {
    // Call into pBuff
  }
  VirtualFree(pBuff, 0, MEM_RELEASE);
}
DEP/NX has no performance impact on Windows. To enable DEP, you should link your code with /NXCOMPAT or call SetProcessDEPPolicy() and test your applications on a DEP-capable CPU, then note and fix any failures resulting from the use of DEP. The use of /NXCOMPAT is similar to calling SetProcessDEPPolicy() on Vista or later Windows versions. However, Windows XP’s loader does not recognize the /NXCOMPAT link option. Consequently, the use of SetProcessDEPPolicy() is generally preferred.
ASLR and DEP provide different protections on Windows platforms. Consequently, you should enable both mechanisms (/DYNAMICBASE and /NXCOMPAT) for all binaries.
PaX
In Linux, the concept of the nonexecutable stack was pioneered by the PaX kernel patch. PaX specifically labeled program memory as nonwritable and data memory as nonexecutable. PaX also provided address space layout randomization (ASLR, discussed under “Detection and Recovery”). It terminates any program that tries to transfer control to nonexecutable memory. PaX can use NX technology, if available, or can emulate it otherwise (at the cost of slower performance). Interrupting attempts to transfer control to nonexecutable memory reduces any remote-code-execution or information-disclosure vulnerability to a mere denial of service (DoS), which makes PaX ideal for systems in which DoS is an acceptable consequence of protecting information or preventing arc injection attacks. Systems that cannot tolerate DoS should not use PaX. PaX is now part of the grsecurity project, which provides several additional security enhancements to the Linux kernel.
StackGap
Many stack-based buffer overflow exploits rely on the buffer being at a known location in memory. If the attacker can overwrite the function return address, which is at a fixed location in the overflow buffer, execution of the attacker-supplied code starts. Introducing a randomly sized gap of space upon allocation of stack memory makes it more difficult for an attacker to locate a return address on the stack and costs no more than one page of real memory. This offsets the beginning of the stack by a random amount so the attacker will not know the absolute address of any item on the stack from one run of the program to the next. This mitigation is relatively easy to add to an operating system.
Although StackGap may make it more difficult for an attacker to exploit a vulnerability, it does not prevent exploits if the attacker can use relative, rather than absolute, values.
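The gap calculation itself is small. The following sketch is not taken from any particular kernel; the function names, the parameters, and the assumption that the caller supplies a random value from the system's random number source are all illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Derives a gap smaller than max_gap from a random value and rounds it down
 * to the required stack alignment. Both max_gap and align must be powers of
 * two, with align <= max_gap. */
size_t stack_gap(uint32_t random_value, size_t max_gap, size_t align) {
    assert(max_gap != 0 && (max_gap & (max_gap - 1)) == 0);
    assert(align != 0 && (align & (align - 1)) == 0 && align <= max_gap);
    size_t gap = (size_t)random_value & (max_gap - 1);  /* 0 .. max_gap - 1 */
    return gap & ~(align - 1);                          /* keep stack aligned */
}

/* Applies the gap to the initial top of a downward-growing stack. */
uintptr_t apply_stack_gap(uintptr_t stack_top, size_t gap) {
    return stack_top - gap;
}
```

With max_gap set to the page size, the gap never exceeds one page, which matches the cost stated above. Because every item on the stack shifts by the same amount, distances between items are unchanged, which is why relative addressing defeats this mitigation.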
Other Platforms
ASLR has been partially available on Mac OS X since 2007 (10.5) and has been fully functional since 2011 (10.7). It has also been functional on iOS (used for iPhones and iPads) since version 4.3.
Future Directions
Future buffer overflow prevention mechanisms will surpass existing capabilities in HP aCC, Intel ICC, and GCC compilers to provide complete coverage by combining more thorough compile-time checking with runtime checks where necessary to minimize the required overhead. One such mechanism is Safe-Secure C/C++ (SSCC).
SSCC infers the requirements and guarantees of functions and uses them to discover whether all requirements are met. For example, in the following function, n is required to be a suitable size for the array pointed to by s. Also, the returned string is guaranteed to be null-terminated.
char *substring_before(char *s, size_t n, char c) {
  for (size_t i = 0; i < n; ++i)
    if (s[i] == c) {
      s[i] = '\0';
      return s;
    }
  s[0] = '\0';
  return s;
}
To discover and track requirements and guarantees between functions and source files, SSCC uses a bounds data file. Figure 2.19 shows one possible implementation of the SSCC mechanism.
Figure 2.19. A possible Safe-Secure C/C++ (SSCC) implementation
If SSCC is given the entire source code to the application, including all libraries, it can guarantee that there are no buffer overflows.