Operators
Operators can produce unanticipated results. As you have seen, unsanitized operands used in simple arithmetic operations can potentially open security holes in applications. These exposures are generally the result of crossing over boundary conditions that affect the meaning of the result. In addition, each operator has associated type promotions that are performed implicitly on each of its operands, which could produce some unexpected results. Because producing unexpected results is the essence of vulnerability discovery, it's important to know how these results might be produced and what exceptional conditions could occur. The following sections highlight these exceptional conditions and explain some common misuses of operators that could lead to potential vulnerabilities.
The sizeof Operator
The first operator worth mentioning is sizeof. It's used regularly for buffer allocations, size comparisons, and size parameters to length-oriented functions. The sizeof operator is susceptible to misuse in certain circumstances that could lead to subtle vulnerabilities in otherwise solid-looking code.
One of the most common mistakes with sizeof is accidentally using it on a pointer instead of its target. Listing 6-24 shows an example of this error.
Listing 6-24. Sizeof Misuse Vulnerability Example
char *read_username(int sockfd)
{
    char *buffer, *style, userstring[1024];
    int i;

    buffer = (char *)malloc(1024);

    if(!buffer){
        error("buffer allocation failed: %m");
        return NULL;
    }

    if(read(sockfd, userstring, sizeof(userstring)-1) <= 0){
        free(buffer);
        error("read failure: %m");
        return NULL;
    }

    userstring[sizeof(userstring)-1] = '\0';

    style = strchr(userstring, ':');

    if(style)
        *style++ = '\0';

    sprintf(buffer, "username=%.32s", userstring);

    if(style)
        snprintf(buffer, sizeof(buffer)-strlen(buffer)-1,
                 ", style=%s\n", style);

    return buffer;
}
In this code, some user data is read in from the network and copied into the allocated buffer. However, sizeof is used incorrectly on buffer. The intention is for sizeof(buffer) to return 1024, but because buffer is a character pointer, sizeof returns the size of the pointer itself: only 4 on a 32-bit platform (8 on most 64-bit platforms). Because buffer already holds at least the 9 characters of "username=", the expression sizeof(buffer)-strlen(buffer)-1 underflows when a style value is present, and the size parameter to snprintf() becomes a huge unsigned value; consequently, an arbitrary amount of data can be written to the memory pointed to by the buffer variable. This error is quite easy to make and often isn't obvious when reading code, so pay careful attention to the types of variables passed to the sizeof operator. These errors occur most frequently in length arguments, as in the preceding example, but they can also occur occasionally when calculating lengths for allocating space. This second type of bug is somewhat rare because the misallocation would likely cause the program to crash and, therefore, get caught before release in many applications (unless it's in a rarely traversed code path).
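The difference between sizeof on a pointer and sizeof on an array can be sketched as follows. This is a minimal illustration, not a drop-in replacement for the listing: USER_BUF_SIZE, pointer_size(), array_size(), and format_username() are hypothetical names, and the sketch assumes the intent was to append the style text after the username while tracking the real capacity of the allocation.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define USER_BUF_SIZE 1024  /* hypothetical constant for the allocation size */

/* sizeof applied to a pointer yields the pointer's size, not the allocation's. */
size_t pointer_size(void)
{
    char *buffer = NULL;

    return sizeof(buffer);      /* same value as sizeof(char *) */
}

/* sizeof applied to an array yields the array's full size in bytes. */
size_t array_size(void)
{
    char userstring[1024];

    return sizeof(userstring);  /* 1024 */
}

/*
 * Formats "username=<user>" and, if style is non-NULL, appends
 * ", style=<style>\n". The remaining space is computed from the known
 * capacity rather than from sizeof on the pointer, so snprintf() truncates
 * instead of overflowing. Returns NULL if allocation fails; the caller
 * frees the result.
 */
char *format_username(const char *user, const char *style)
{
    char *buffer = malloc(USER_BUF_SIZE);
    size_t used;

    if (buffer == NULL)
        return NULL;

    snprintf(buffer, USER_BUF_SIZE, "username=%.32s", user);
    used = strlen(buffer);

    if (style != NULL)
        snprintf(buffer + used, USER_BUF_SIZE - used, ", style=%s\n", style);

    return buffer;
}
```

Carrying the capacity in a named constant (or alongside the pointer) keeps the length argument correct even if the buffer later changes from an array to a heap allocation.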
The sizeof operator also plays an integral role in signed and unsigned comparison bugs (explored in the "Comparison" section earlier in this chapter) and structure padding issues (explored in "Structure Padding" later in this chapter).
Unexpected Results
You have explored two primary idiosyncrasies of arithmetic operators: boundary conditions related to the storage of integer types and issues caused by conversions that occur when arithmetic operators are used in expressions. A few other nuances of C can lead to unanticipated behaviors, specifically nuances related to underlying machine primitives being aware of signed-ness. If a result is expected to fall within a specific range, attackers can sometimes violate those expectations.
Interestingly enough, on two's complement machines, there are only a few operators in C in which the signed-ness of operands can affect the result of the operation. The most important operators in this group are comparisons. In addition to comparisons, only three other C operators have a result that's sensitive to whether operands are signed: right shift (>>), division (/), and modulus (%). These operators can produce unexpected negative results when they're used with signed operands because their underlying machine-level operations are sign-aware. As a code reviewer, you should be on the lookout for misuse of these operators because they can produce results that fall outside the range of expected values and catch developers off guard.
The right shift operator (>>) is often used in applications in place of the division operator (when dividing by powers of 2). Problems can happen when using this operator with a signed integer as the left operand. When right-shifting a negative value, the sign of the value is preserved by the underlying machine performing a sign-extending arithmetic shift. (The C standard leaves the result of right-shifting a negative value implementation-defined, but mainstream compilers for two's complement machines use an arithmetic shift.) This sign-preserving right shift is shown in Listing 6-25.
Listing 6-25. Sign-Preserving Right Shift
signed char c = 0x80;
c >>= 4;

1000 0000 - value before right shift
1111 1000 - value after right shift
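You can observe the same effect with a small experiment. The sketch below uses hypothetical helper names (shift_signed(), shift_unsigned(), show_shifts()) and assumes a compiler that implements signed right shift as an arithmetic shift, as GCC and Clang do.

```c
#include <stdio.h>

/*
 * Shifts a signed char right. The operand is promoted to int first, and
 * right-shifting a negative int is implementation-defined in C; mainstream
 * compilers shift in copies of the sign bit.
 */
int shift_signed(signed char c, int count)
{
    return c >> count;
}

/*
 * Converts to unsigned char before shifting, so the promoted value is
 * non-negative and zero bits are shifted in from the left.
 */
int shift_unsigned(signed char c, int count)
{
    return (unsigned char)c >> count;
}

/* Prints both results for the bit pattern 1000 0000. */
void show_shifts(void)
{
    signed char c = -128;   /* bit pattern 0x80 in two's complement */

    printf("signed:   %d (0x%02x)\n",
           shift_signed(c, 4), (unsigned int)(shift_signed(c, 4) & 0xff));
    printf("unsigned: %d (0x%02x)\n",
           shift_unsigned(c, 4), (unsigned int)(shift_unsigned(c, 4) & 0xff));
}
```

Under that assumption, the signed shift yields -8 (low byte 1111 1000), whereas the unsigned shift yields 8 (0000 1000).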
Listing 6-26 shows how this code might produce an unexpected result that leads to a vulnerability. It's close to an actual vulnerability found recently in client code.
Listing 6-26. Right Shift Vulnerability Example
int print_high_word(int number)
{
    char buf[sizeof("65535")];

    sprintf(buf, "%u", number >> 16);

    return 0;
}
This function is designed to print a 16-bit unsigned integer (the high 16 bits of the number argument). Because number is signed, the right shift sign-extends number by 16 bits if it's negative. Therefore, the %u specifier to sprintf() can print a value as large as 4294967295 (assuming a 32-bit int), which needs 10 digits plus a terminating NUL byte. That is far more than the 6 bytes that sizeof("65535") allocates for the destination buffer, so the result is a buffer overflow.
Vulnerable right shifts are good examples of bugs that are difficult to locate in source code yet readily visible in assembly code. In Intel assembly code, a signed, or arithmetic, right shift is performed with the sar mnemonic. A logical, or unsigned, right shift is performed with the shr mnemonic. Therefore, analyzing the assembly code can help you determine whether a right shift is potentially vulnerable to sign extension. Table 6-9 shows signed and unsigned right-shift operations in the assembly code.
Table 6-9. Signed Versus Unsigned Right-Shift Operations in Assembly
Signed Right-Shift Operations | Unsigned Right-Shift Operations
mov eax, [ebp+8]              | mov eax, [ebp+8]
sar eax, 16                   | shr eax, 16
push eax                      | push eax
push offset string            | push offset string
lea eax, [ebp+var_8]          | lea eax, [ebp+var_8]
push eax                      | push eax
call sprintf                  | call sprintf
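One way to avoid the sign extension in a function like print_high_word() is to convert the operand to an unsigned type before shifting. The following is a sketch under stated assumptions rather than the fix used in any real code base: format_high_word() and chars_needed_signed_shift() are hypothetical names, and int is assumed to be 32 bits wide.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Formats the high 16 bits of number as an unsigned decimal string.
 * Converting to unsigned int before the shift makes the compiler use a
 * logical shift, and the mask keeps the value within 16 bits even if int
 * is wider than 32 bits. Returns the value returned by snprintf().
 */
int format_high_word(int number, char *out, size_t outlen)
{
    unsigned int high = ((unsigned int)number >> 16) & 0xFFFFu;

    return snprintf(out, outlen, "%u", high);
}

/*
 * Returns how many characters the signed shift would print with %u.
 * snprintf() with a NULL buffer and zero size only measures the output.
 */
int chars_needed_signed_shift(int number)
{
    return snprintf(NULL, 0, "%u", (unsigned int)(number >> 16));
}
```

With a 32-bit int and an arithmetic shift, chars_needed_signed_shift(-1) reports 10 characters, while format_high_word(-1, ...) never needs more than the 5 digits of "65535". Using snprintf() with the real buffer size adds a second layer of protection if the range assumption is ever wrong.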
Division (/) is another operator that can produce unexpected results because of sign awareness. Whenever exactly one of the operands is negative, the resulting quotient is negative (or zero, if the result truncates to zero). Often, applications don't account for the possibility of negative results when performing division on integers. Listing 6-27 shows how using negative operands could create a vulnerability with division.
Listing 6-27. Division Vulnerability Example
int read_data(int sockfd)
{
    int bitlength;
    char *buffer;

    bitlength = network_get_int(sockfd);

    buffer = (char *)malloc(bitlength / 8 + 1);
    if (buffer == NULL)
        die("no memory");

    if(read(sockfd, buffer, bitlength / 8) < 0){
        error("read error: %m");
        return -1;
    }

    return 0;
}
Listing 6-27 takes a bitlength parameter from the network and allocates memory based on it. The bitlength is divided by 8 to obtain the number of bytes needed for the data that's subsequently read from the socket. One is added to the result, presumably to store extra bits in if the supplied bitlength isn't a multiple of 8. If the division can be made to return -1 (for example, with a bitlength between -15 and -8, because integer division truncates toward zero in C99 and later), the addition of 1 produces 0, resulting in a small amount of memory being allocated by malloc(). Then the third argument to read() would be -1, which would be converted to a size_t and interpreted as a large positive value.
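The arithmetic behind this attack, and one possible range check, can be sketched as follows. The names raw_byte_count(), bits_to_bytes(), and MAX_BITLENGTH are hypothetical, and the upper limit is an arbitrary value chosen for illustration; a real application would pick a limit that fits its protocol.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_BITLENGTH (1 << 20)  /* hypothetical application-defined limit */

/* The unchecked calculation from the listing: negative input, negative output. */
int raw_byte_count(int bitlength)
{
    return bitlength / 8;   /* truncates toward zero, so -8 / 8 == -1 */
}

/*
 * Validates bitlength before converting it to a byte count. Rounds up so
 * that a bitlength that isn't a multiple of 8 still gets a full final byte.
 * Returns 0 and stores the count in *nbytes on success, or -1 if the
 * bitlength is negative or too large.
 */
int bits_to_bytes(int bitlength, size_t *nbytes)
{
    if (bitlength < 0 || bitlength > MAX_BITLENGTH)
        return -1;

    *nbytes = ((size_t)bitlength + 7) / 8;
    return 0;
}
```

For a bitlength of -8, raw_byte_count() returns -1, so the allocation size becomes 0 and the read length becomes SIZE_MAX after conversion to size_t. The checked version rejects the value before any arithmetic is done with it.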
Similarly, the modulus operator (%) can produce negative results when dealing with a negative dividend operand. Code auditors should be on the lookout for modulus operations that don't properly sanitize their dividend operands because they could produce negative results that might create a security exposure. Modulus operators are often used when dealing with fixed-size arrays (such as hash tables), so a negative result could immediately index before the beginning of the array, as shown in Listing 6-28.
Listing 6-28. Modulus Vulnerability Example
#define SESSION_SIZE 1024

struct session {
    struct session *next;
    int session_id;
};

struct header {
    int session_id;
    ...
};

struct session *sessions[SESSION_SIZE];

struct session *session_new(int session_id)
{
    struct session *new1, *tmp;

    new1 = malloc(sizeof(struct session));
    if(!new1)
        die("malloc: %m");

    new1->session_id = session_id;
    new1->next = NULL;

    if(!sessions[session_id%(SESSION_SIZE-1)])
    {
        sessions[session_id%(SESSION_SIZE-1)] = new1;
        return new1;
    }

    for(tmp = sessions[session_id%(SESSION_SIZE-1)]; tmp->next;
        tmp = tmp->next);

    tmp->next = new1;

    return new1;
}

int read_packet(int sockfd)
{
    struct session *session;
    struct header hdr;

    if(full_read(sockfd, (void *)&hdr, sizeof(hdr)) != sizeof(hdr))
    {
        error("read: %m");
        return -1;
    }

    if((session = session_find(hdr.session_id)) == NULL)
    {
        session = session_new(hdr.session_id);
        return 0;
    }

    ... validate packet with session ...

    return 0;
}
As you can see, a header is read from the network, and session information is retrieved from a hash table based on the header's session identifier field. The sessions are stored in the sessions hash table for later retrieval by the program. If the session identifier is negative, the result of the modulus operator is negative, and out-of-bounds elements of the sessions array are indexed and possibly written to, which would probably be an exploitable condition.
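The effect of a negative dividend on the bucket index can be sketched in isolation. The helper names bucket_signed() and bucket_unsigned() are hypothetical; the modulus SESSION_SIZE-1 is kept only to match the listing, and converting the identifier to unsigned is one possible remedy among several (rejecting negative identifiers outright is another).

```c
#include <stddef.h>

#define SESSION_SIZE 1024

/*
 * The calculation from the listing. In C99 and later, the result of %
 * takes the sign of the dividend, so a negative session_id produces a
 * negative index.
 */
int bucket_signed(int session_id)
{
    return session_id % (SESSION_SIZE - 1);
}

/*
 * Converts the identifier to unsigned int before taking the modulus, so
 * the result is always in the range 0 to SESSION_SIZE-2 and is a valid
 * index into an array of SESSION_SIZE elements.
 */
size_t bucket_unsigned(int session_id)
{
    return (unsigned int)session_id % (SESSION_SIZE - 1);
}
```

For example, bucket_signed(-5) evaluates to -5, which would index five elements before the start of the sessions array, whereas bucket_unsigned(-5) always lands inside the table.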
As with the right-shift operator, unsigned and signed divide and modulus operations can be distinguished easily in Intel assembly code. The mnemonic for the unsigned division instruction is div and its signed counterpart is idiv. Table 6-10 shows the difference between signed and unsigned divide operations. Note that compilers often use right-shift operations rather than division when the divisor is a constant power of 2.
Table 6-10. Signed Versus Unsigned Divide Operations in Assembly
Signed Divide Operations | Unsigned Divide Operations
mov eax, [ebp+8]         | mov eax, [ebp+8]
mov ecx, [ebp+0Ch]       | mov ecx, [ebp+0Ch]
cdq                      | xor edx, edx
idiv ecx                 | div ecx
ret                      | ret