Robert Seacord on the CERT C Secure Coding Standard
I recently had the opportunity to interview Robert Seacord, author of the recently-published The CERT C Secure Coding Standard. Robert has been deeply involved with C and UNIX for longer than I've been programming in any language. Read on to find out about his work on developing C standards and his views on the future of the language.
David Chisnall: To get things started, could you give the readers a bit of background about yourself and what got you interested in secure coding?
Robert C. Seacord: I started working for IBM in 1984 in processor development (I was on the development team for S/390, among other projects). I was always interested in finding better ways to develop software, so I ended up joining the software engineering group at IBM and working on a graphical language editor written in C for the IBM PC. In 1987, I had an opportunity to join the Software Engineering Institute (SEI) where I worked on a user interface management system called Serpent. I left the SEI in 1991 to commercialize this technology, and eventually ended up at the X Consortium in Cambridge working on X11 and the Common Desktop Environment. When the XC shut its doors in 1996, I returned to the SEI and worked in the area of component-based software engineering.
I joined the Computer Emergency Response Team (CERT), which is part of the SEI, in 2003 as a vulnerability analyst. At the time, I had no idea I was interested in secure coding, but of course I had always been intrigued by the various properties of software (such as performance and portability) that are manifested by the coding choices we make. While I understood how to produce code that performed reliably under normal operations, I did not fully understand the exploits that could be launched against software and how code that appeared to be perfectly correct could be vulnerable to attack. After reading an (unpublished) technical report on the root causes of vulnerabilities, I recognized the potential for a disciplined approach to secure software development and consequently researched the material for Secure Coding in C and C++ (which was published by Addison-Wesley in September 2005).
DC: What motivated you to write a book on secure coding?
rCs: The CERT C Secure Coding Standard was the fourth book I've written, and the second on software security. The idea came up at the WG14 Berlin meeting in April of 2006 that the C programming community would benefit from CERT developing a secure coding standard. I immediately saw the wisdom of this proposal. The C99 standard is an authoritative document, but the audience for it is primarily compiler implementers and, as has been noted by many, its language is obscure and often impenetrable. The CERT C Secure Coding Standard is geared towards C language programmers and provides actionable guidance on how to code securely in the language.
DC: A lot of people have given up on the idea of writing secure code in C and decided that the only solution is to modify the language, most commonly the memory model. What is your opinion on languages which try to be a safer C, like BitC and Limbo?
rCs: Secure coding in C poses some interesting challenges, but I think that the people who have given up writing secure programs in C are not maintaining huge code bases written in this language. I'm not an expert on either BitC or Limbo, and although the objectives of these languages appear admirable, it doesn't appear that either language addresses the legacy code issue. Another problem with these languages is that they are not industrial strength languages, meaning that the pool of skilled engineers available for these languages is small, there are a limited number of compilers available for a limited set of platforms, and there are fewer libraries available. The D programming language, on the other hand, is more widely used but is not strictly backward compatible with C source code.
A memory model for C1X is currently on the front burner of WG14.
When it comes to new development efforts, I have found that language choice is typically based on multiple criteria, and that security is rarely the determining factor.
DC: You've talked a bit about people maintaining legacy code bases, and one of the most well known examples in the open source world is one you were involved with near the start: X11. Do you think there are any lessons, good or bad, that other people with similar-aged codebases could learn from this project?
rCs: X11 and CDE were the best-managed software projects I've worked on, so there were many positive lessons. The X Consortium recruited highly competent, professional engineers and provided them with a well-defined, tool-supported development process. Because portability was a concern, we were required to test our code on multiple platforms before promoting the code. The thing that impressed me most, however, was that the management did not allow developers to get into a comfort zone. Defects were assigned based on availability and priority so we would frequently be asked to repair defects in components (and possibly languages) we weren't necessarily familiar with. At first glance, this might seem like a bad idea, but it kept the engineers sharp, engaged, and always learning. There was also a strong emphasis on quality, meaning that we were always given the time to do the job right.
DC: What other advice would you give to people maintaining a large, widely-deployed, aging project?
rCs: The first thing to decide with an aging system is if you want to keep it running or if you want to replace it. Replacing an aging system can be an expensive and difficult process, particularly if you need to maintain continuous operations. To keep a system running, you need to allocate adequate resources to the project to support the continuous evolution of the software and include experienced software engineers on the project team. Many software defects, including defects which result in vulnerabilities, are introduced during maintenance, so it is important to maintain quality controls. Probably the most important thing during software evolution is to maintain the architectural integrity of the software. It is the beginning of the end for any system when you start to add features or capabilities to a system that are not supported by the architecture. To avoid an expensive modernization or replacement effort, try to find time to perform preventative maintenance rather than only repairing known defects or adding new features. (My second book, also published by Addison-Wesley, is on Modernizing Legacy Systems).
DC: You mentioned C1X a bit earlier. What is your connection with the standards process?
rCs: I am the CMU representative to INCITS PL22.11, the US committee contributing to ISO/ANSI C, and a technical expert at ISO/IEC WG14. At the WG14/INCITS J11 meeting in London, UK, in April 2007, there was general agreement that the committee should start thinking about what was next for a revision of the C Standard. At this same meeting, the original principles that were used for C9X were reviewed, and the observation was made that "trust the programmer," as a goal, is outdated with respect to the security and safety programming communities and that the C1X version of the C Standard should take into account that programmers need the ability to check their work.
DC: What do you consider the most important proposed changes in the next version of the C standard?
rCs: From a safety and security perspective, I think the most important changes to C1X are those that improve the analyzability of C language programs and eliminate or reduce the impact of undefined behaviors. Many of these changes support the recommendations we have documented in The CERT C Secure Coding Standard. For example, I am working with other C language standard committee members to propose a normative but optional annex to C1X for "Error Checked" implementations. This annex would likely include the functions defined by TR 24731-1 (STR07-C. Use TR 24731 for remediation of existing string manipulation code), including the runtime constraint handling mechanism (ERR03-C. Use runtime-constraint handlers when calling functions defined by TR24731-1), as well as additional capabilities. For example, preventing run-time overflow by program logic is sometimes easy, sometimes complicated, and sometimes extremely difficult (INT30-C. Ensure that unsigned integer operations do not wrap, INT32-C. Ensure that operations on signed integers do not result in overflow). In many cases, the resulting code is much less efficient than what a compiler could generate to recognize that an overflow took place. We would like to propose that implementations which support this annex (for example, as a compiler option) would detect overflow when it occurs and report it by invoking a runtime-constraint handler before the subsequent return or function call operation. This should allow programmers and implementers to keep branches out of inner loops and not unduly impact performance. Another important change is the adoption of static assertion from the new C++ standard (see DCL03-C. Use a static assertion to test the value of a constant expression).
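As a rough illustration of the kind of precondition tests that INT30-C and INT32-C ask programmers to write by hand, and of a static assertion in the spirit of DCL03-C, here is a minimal sketch. The helper names add_unsigned_checked and add_signed_checked are hypothetical, not taken from the standard, and the _Static_assert keyword assumes a compiler that supports the static assertion in the form later standardized in C11. Note that each check adds a branch before the arithmetic, which is the cost the proposed annex aims to move into the implementation.

```c
#include <limits.h>
#include <stdbool.h>

/* Static assertion: checked at compile time, no runtime cost.
 * The 32-bit requirement is only an illustrative assumption. */
_Static_assert(sizeof(int) * CHAR_BIT >= 32,
               "this sketch assumes int is at least 32 bits wide");

/* Hypothetical helper: stores a + b in *sum and returns true,
 * or returns false (leaving *sum untouched) if the sum would wrap. */
bool add_unsigned_checked(unsigned int a, unsigned int b, unsigned int *sum)
{
    if (UINT_MAX - a < b) {
        return false;           /* a + b would wrap around */
    }
    *sum = a + b;
    return true;
}

/* Hypothetical helper: same idea for signed int, where overflow is
 * undefined behavior and so must be ruled out before the addition. */
bool add_signed_checked(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;           /* a + b is not representable as int */
    }
    *sum = a + b;
    return true;
}
```

A caller would test the return value before using the result, for example rejecting a length computation when add_unsigned_checked reports a wrap.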
DC: C began life as the ultimate domain-specific language, being created to write one thing: UNIX. It grew in the '80s to be an incredibly widely-used general-purpose language, and has since had a lot of its market-share eroded by higher-level languages. Do you see C continuing to be viewed as a general-purpose language? If not, what niches do you see it continuing to serve?
rCs: Right now C is doing well in the embedded systems market. Unless something is done about software security in C, I expect to see the market for the language contract as developers become less concerned with performance and more concerned with software security. I think The CERT C Secure Coding Standard is a good start, but we need better and cheaper source code analysis tools in the short term, and fundamental improvements to the language in the long term.
DC: On the subject of tools, are you familiar with the clang static analyzer? If so, what are your opinions on it?
rCs: The LLVM/Clang static analyzer is a standalone tool that finds bugs in C and Objective-C programs. It has potential for extensibility because it is based around a compiler framework, and appears to have a large community involved in support and development. At this time it seems to be pretty limited in the number of checks it does for C and the C++ support is quite unfinished. However, in some testing we've done it had a pretty low rate of false positives for those bugs that it can find.
DC: If you didn't have to worry about persuading the standards committee, what would you change about C?
rCs: I've supported a number of proposals for improving security in C. Some of these have been accepted, some have been shot down for good reason, and some are works in progress. Among the things we are working on is adding an 'x' mode character to the C99 fopen() and freopen() functions that causes fopen() to fail rather than open a file that already exists. This is necessary to eliminate a time-of-creation to time-of-use race condition vulnerability. I would also like to see something akin to the nonstandard clearenv() function that can be used to clear out the environment in an implementation-defined manner. This would ensure non-critical environmental values are removed, and critical environmental variables are set to default values. I would also like to see support for encrypted function pointers. In many systems and applications, function pointers are stored unencrypted in locations where they can be overwritten. An attacker who can exploit a vulnerability to perform an arbitrary memory write (write an address to an address) can overwrite the function pointer and transfer control to arbitrary code when the function is called. Instead of storing a function pointer, the compiler can store an encrypted version of the pointer and decrypt the pointer before invoking the function. An attacker would need to break the encryption to redirect the pointer to malicious code.
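The 'x' mode character described here was later adopted in C11, so the idea can be sketched with a library that implements it. The helper name create_new_file is hypothetical, and the sketch assumes a C library that honors the "wx" mode string; on older libraries the call may simply fail or ignore the flag.

```c
#include <stdio.h>

/* Hypothetical helper: create a new file for writing, failing if a file
 * already exists at path. Assumes the C library supports the C11 "x"
 * (exclusive) mode character. Returns NULL on failure. */
FILE *create_new_file(const char *path)
{
    /* "wx": the existence check and the creation happen in one step,
     * so another process cannot slip a file in between them. */
    return fopen(path, "wx");
}
```

The caller treats a NULL return as "could not create a fresh file" and must not fall back to opening the path in plain "w" mode, which would reintroduce the race.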
DC: And if you were creating a new language without worrying about backwards compatibility, what would you do differently?
rCs: I actually created a new language some years ago called "Slang" for the Serpent User Interface Management System. Now that I'm older and more experienced, I don't think I would create another new language. Right now I'm more interested in fixing the ones we already have.
DC: Microsoft created "Managed C++" as a subset of C++ that could run in a typesafe virtual machine. Do you see a market for a "Managed C" that would be better suited to static analysis? If so, how would you go about designing such a language?
rCs: After completing Secure Coding in C and C++ back in September 2005, I spent some time researching C++/CLI (Common Language Infrastructure). At the time, it was difficult to generate verifiable components using /clr:safe because the available documentation was frequently unclear as to what was valid for /clr:safe. In many instances I would find an example that was supposed to compile and run under /clr but would cause errors if used with /clr:safe. I did publish one article on C++/CLI in May 2006 called Secure Coding in C++/CLI: Is buffer overflow still a problem? before I abandoned this work to focus on The CERT C Secure Coding Standard.
I guess I don't see a market for Managed C, primarily because C++/CLI never really seemed to take off, and because I think C users are even less likely to adopt a managed version of the language. I think there are better ways to accomplish the same goals, such as Safe-Secure C/C++ from Plum Hall.
DC: The vulnerability you mentioned, related to function pointers, sounds like it could be mitigated by an operating system with write-xor-execute policies and completely avoided with a CPU that supports Mondrian memory protection. Are there any other features you'd like to see added in operating systems or instruction sets to make secure coding easier?
rCs: I think that Mondrian memory protection could potentially mitigate against arbitrary writes, depending on how it is implemented. For example, in the following code:
    void good_function(const char *str) {...}
    static void (*funcPtr)(const char *str);
    funcPtr = &good_function;
    (void)(*funcPtr)("test");
Memory allocated to funcPtr needs to be writable at the time of the assignment but at no other time. Encrypting and decrypting of function pointers is currently performed by Visual C++'s C runtime libraries using the EncodePointer() and DecodePointer() functions. Currently, calls to these functions are manually inserted and are not automatically generated by the compiler.
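To make the idea of storing an encoded function pointer concrete, here is a minimal sketch. It is not how EncodePointer() and DecodePointer() are implemented; the names encode_pointer, decode_pointer, pointer_key, and demo_handler are hypothetical, the XOR with a fixed key is a stand-in for whatever scheme a real implementation would use with a secret per-process value, and the conversion between a function pointer and uintptr_t is assumed to work as it does on common platforms (ISO C does not guarantee it).

```c
#include <stdint.h>

typedef void (*handler_fn)(const char *str);

/* Hypothetical key. A real implementation would pick a random value at
 * process start and keep it out of reach of the attacker. */
static uintptr_t pointer_key = (uintptr_t)0x5A5AA5A5u;

/* Counts calls so the effect of invoking the handler is observable. */
static int call_count;

/* Hypothetical target function for the demonstration. */
void demo_handler(const char *str)
{
    (void)str;
    call_count++;
}

/* Store this value instead of the raw function pointer. */
uintptr_t encode_pointer(handler_fn fn)
{
    return (uintptr_t)fn ^ pointer_key;
}

/* Recover the callable pointer immediately before the call. */
handler_fn decode_pointer(uintptr_t encoded)
{
    return (handler_fn)(encoded ^ pointer_key);
}
```

A program would keep only the result of encode_pointer(demo_handler) in writable memory and call decode_pointer(stored)("test") at the point of use. An attacker who overwrites the stored value without knowing the key ends up with an unpredictable target after decoding rather than a chosen address.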
The trend in adding runtime mitigations to operating systems is encouraging, and many of the current techniques are quite effective at raising the bar for exploit writers. Continuing the trend with randomization in even more places in the operating system (even when it doesn't mitigate currently known attacks) and the further reduction of privilege that various components run with are both features I would like to see promoted.
The instruction set question is an interesting one. This is unlikely to be the most critical change, but we recently discovered an interesting property of signed integer operations on the IA-32 instruction set. Dividing INT_MIN by -1 results in an overflow on any machine that uses 2's complement representation because the resulting value cannot be represented. The IA-32 idiv instruction generates a fault on interrupt vector 0 for any division error including overflow and divide-by-zero. What is even more interesting is the behavior of the % operator which is meant to calculate the remainder. The expression INT_MIN % -1 should result in the value 0 (the remainder when INT_MIN is divided by -1) but instead results in a fault because the idiv instruction is used to calculate the remainder. In fact, if you use the constant values as shown above the compiler will likely optimize your code and produce the correct value of zero. However, if you provide the values at runtime, the compiler will need to generate code to calculate the remainder, which will result in a fault. Although this could be viewed as a compiler bug, the consensus of WG14 is that this should be explicitly documented as undefined behavior meaning that any compiler that exhibits this behavior can still be considered conforming.
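Until the language or the hardware changes, the usual defense is to test the operands before dividing. The following is a minimal sketch of such a test; the helper names checked_quotient and checked_remainder are hypothetical, and returning 0 for a remainder with divisor -1 reflects the mathematical result described above rather than anything the C standard promises for INT_MIN % -1.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores dividend / divisor in *result and returns
 * true, or returns false for division by zero and for INT_MIN / -1,
 * whose result cannot be represented in an int. */
bool checked_quotient(int dividend, int divisor, int *result)
{
    if (divisor == 0 || (dividend == INT_MIN && divisor == -1)) {
        return false;
    }
    *result = dividend / divisor;
    return true;
}

/* Hypothetical helper: stores dividend % divisor in *result and returns
 * true, or returns false for division by zero. */
bool checked_remainder(int dividend, int divisor, int *result)
{
    if (divisor == 0) {
        return false;
    }
    if (divisor == -1) {
        /* Every int leaves remainder 0 when divided by -1. Answering
         * here means the % operator is never executed for
         * INT_MIN % -1, the case that faults. */
        *result = 0;
        return true;
    }
    *result = dividend % divisor;
    return true;
}
```

These checks are exactly the kind of extra branches that hand-written overflow prevention adds to a program, which is why compiler or hardware support for detecting the condition is attractive.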
Robert C. Seacord began programming (professionally) for IBM in 1982 and has been programming in C since 1985 and in C++ since 1992. Robert is currently a Senior Vulnerability Analyst with the CERT/Coordination Center at the Software Engineering Institute (SEI). His most recent book, The CERT C Secure Coding Standard, can be found at InformIT and bookstores everywhere and in Safari Books Online.
David Chisnall, author of The Definitive Guide to the Xen Hypervisor (also available in Safari Books Online) and numerous articles on InformIT, is a research assistant at Swansea University and a founding member and core developer of the Étoilé project, which aims to build an open source user environment for desktop and mobile computing systems.