The Bright Parts of x86
A while ago, I wrote about the dark corners of the x86 architecture. Because it has so many kludges, it's easy to overlook the fact that this architecture also has a number of quite nice features, many of which were abandoned by AMD for x86-64. In this article, we'll take a look at some of the more interesting features and consider why they haven't seen wider use.
In many cases, C is to blame. The first low-level programming language I learned was PL/M, a language that exposed all the features of the native instruction set to the programmer. One reason that PL/M never caught on to the same degree as C and its descendants was that it was very hard to write portable code in the language. There was a portable subset, but the real power came from using the system-specific parts.
In contrast, C exposed a model of the hardware that looked like a PDP-11. This design made it impossible to use any of the more advanced features directly from C, but also meant that code written in C could run on any architecture.
Call Gates
Recent x86 chips offer several ways of issuing system calls. The traditional method is a software interrupt. It dates back to DOS, which used interrupt 21h, and was inherited by most subsequent operating systems; 32-bit Linux, for example, uses interrupt 80h. The interrupt jumps through the interrupt vector table to the operating system's handler, which looks at the value in the first register (EAX, on Linux) and makes the corresponding system call. Newer chips provide the SYSCALL and SYSENTER instructions, which skip the interrupt machinery and jump directly to the kernel's system call entry point.
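The register-based dispatch described above can be sketched in ordinary C. This is only a user-space model under stated assumptions: the `struct regs` layout, the two system calls, and the `ERR_NOSYS` value are hypothetical, and the use of EAX for the call number follows the 32-bit Linux convention rather than anything the hardware requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical saved-register block: only the registers this sketch needs. */
struct regs {
    uint32_t eax;  /* system call number on entry, return value on exit */
    uint32_t ebx;  /* first argument */
    uint32_t ecx;  /* second argument */
};

typedef uint32_t (*syscall_fn)(uint32_t a, uint32_t b);

/* Two made-up system calls, for illustration only. */
static uint32_t sys_add(uint32_t a, uint32_t b) { return a + b; }
static uint32_t sys_max(uint32_t a, uint32_t b) { return a > b ? a : b; }

/* The table index is the system call number. */
static const syscall_fn syscall_table[] = { sys_add, sys_max };

#define NR_SYSCALLS (sizeof syscall_table / sizeof syscall_table[0])
#define ERR_NOSYS   0xFFFFFFFFu  /* hypothetical "no such call" result */

/* What the interrupt handler does once the registers have been saved:
 * validate the number in EAX, call through the table, store the result. */
static void syscall_entry(struct regs *r)
{
    assert(r != NULL);
    if (r->eax >= NR_SYSCALLS) {
        r->eax = ERR_NOSYS;
        return;
    }
    r->eax = syscall_table[r->eax](r->ebx, r->ecx);
}
```

Calling `syscall_entry` with `eax` set to 1 runs `sys_max` on the two argument registers and leaves the result in `eax`. A real kernel does the same lookup, but only after an interrupt or a SYSCALL instruction has switched privilege levels and saved the caller's registers.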
One other way of issuing system calls isn't widely used. From the perspective of a programmer using a high-level language, a system call looks rather like a function call, albeit one handled by the kernel rather than by a library. The protected-mode design that the 386 inherited from the 286 was intended to expose this abstraction to low-level programmers as well. When you issue a far call into another segment, the destination may be a normal function, or it may be a call gate. These special landing pads raise the privilege level when they are entered, and the matching return restores it. Issuing a system call is just a matter of calling the correct system call handler via a call gate.
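To make the landing pad concrete, here is a minimal sketch of how a 32-bit call gate descriptor is packed into the eight bytes that go into a descriptor table. The field layout follows the 32-bit call gate format documented in Intel's manuals; the function name `make_call_gate` and the constants are my own, and installing the result in a real GDT or LDT is not shown.

```c
#include <assert.h>
#include <stdint.h>

#define GATE_PRESENT     0x80u  /* P bit: the descriptor is valid */
#define GATE_TYPE_CALL32 0x0Cu  /* system descriptor type: 32-bit call gate */

/* Pack a 32-bit call gate descriptor (hypothetical helper).
 *   selector    - code segment that contains the handler
 *   offset      - entry point of the handler within that segment
 *   dpl         - least-privileged ring allowed to call through the gate
 *   param_count - stack parameters the CPU copies to the new stack (0-31) */
static uint64_t make_call_gate(uint16_t selector, uint32_t offset,
                               unsigned dpl, unsigned param_count)
{
    assert(dpl <= 3);
    assert(param_count <= 31);

    uint64_t access = GATE_PRESENT | (dpl << 5) | GATE_TYPE_CALL32;
    uint64_t d = 0;

    d |= (uint64_t)(offset & 0xFFFFu);    /* bits  0-15: offset, low half  */
    d |= (uint64_t)selector << 16;        /* bits 16-31: target selector   */
    d |= (uint64_t)param_count << 32;     /* bits 32-36: parameter count   */
    d |= access << 40;                    /* bits 40-47: P, DPL, type      */
    d |= (uint64_t)(offset >> 16) << 48;  /* bits 48-63: offset, high half */
    return d;
}
```

Note that the caller never supplies the entry point. A far call names only the gate's selector, and the processor takes the target segment and offset from the descriptor, which is what stops unprivileged code from jumping into the middle of a handler.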
This mechanism was intended to work with one of the other unusual features of the architecture: It has four privilege levels (rings), rather than just two. You could run the core kernel in ring 0, which has total privilege, device drivers in ring 1, operating system services in ring 2, and userspace applications out in ring 3. Code in each ring could access all of the memory belonging to less-privileged (higher-numbered) rings, but could not touch anything in a more-privileged (lower-numbered) ring.
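The ring rule can be written down as a one-line comparison. The sketch below models the check the processor applies when code loads a data segment: the current privilege level (CPL) and the requested privilege level (RPL, the low two bits of the selector) must both be numerically no greater than the segment's descriptor privilege level (DPL). The function names are hypothetical, and the model ignores every other check the hardware performs, such as limits and segment types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The low two bits of a segment selector hold the requested privilege level. */
static unsigned selector_rpl(uint16_t selector)
{
    return selector & 3u;
}

/* Model of the data-segment privilege check.
 *   cpl      - ring the running code is in (0 = most privileged)
 *   selector - selector being loaded; its RPL can only weaken the request
 *   dpl      - ring the target segment belongs to */
static bool data_access_allowed(unsigned cpl, uint16_t selector, unsigned dpl)
{
    assert(cpl <= 3 && dpl <= 3);

    unsigned rpl = selector_rpl(selector);
    unsigned effective = cpl > rpl ? cpl : rpl;  /* the weaker of the two */
    return effective <= dpl;
}
```

With this rule, ring 0 code can load a ring 3 segment, while ring 3 code is refused a ring 0 segment. The RPL field lets privileged code deliberately act with a caller's weaker rights when it handles a selector that the caller passed in.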
The call gate mechanism let code in each ring call into the more-privileged rings through controlled entry points: applications into system services, system services into device drivers or the kernel, and so on, with each return dropping back to the caller's ring. Unfortunately, portable operating systems used only the two privilege levels that were available on most architectures. Because call gates were rarely used, nobody made an effort to make them fast, which discouraged people from using them, and so on.