- What Are You Hiding?
- Impedance Mismatch
- Unifying Abstractions
- Is Your Abstraction Recursive?
- Think Twice, Code Once
Unifying Abstractions
If you've studied CPU architectures or operating systems, you're probably familiar with the two forms of memory protection that were developed in the last century: pages and segments. Pages are a fixed size, which makes them very convenient for operating systems that want to implement virtual memory. You can trivially map pages to some space on the disk and swap them out. Pages also don't suffer from fragmentation, because you can always take one page from one process and give it to another, without worrying that they're the wrong size.
An operating systems text will tell you that pages won because they were better; that is, unless you consider the end user. For someone sitting in userspace, segments are a much more useful abstraction, which you'll have discovered if you ever used the mprotect() function on a POSIX system. This function takes a pointer, a size, and a set of protection flags, and lets you change the memory protection for a variable. This is convenient because you can mark a variable as being read-only, for example, and the hardware enforces that setting.
That's the theory, anyway. Unfortunately, on most modern hardware, mprotect() is implemented in terms of paging. It doesn't change the memory protection of the memory region that you select; instead, it changes the protection of the pages containing the region. If you're lucky, these pages are 4KB, but on some systems they can be 8MB or more. You need to allocate an entire page for a single variable if you want to be able to use mprotect() on it, so you may end up with vast amounts of wasted space.
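To make the cost concrete, here is a minimal sketch of the rounding that page-based protection implies. It only does the arithmetic and does not call mprotect() itself; the helper names `page_span` and `collateral_bytes` are hypothetical, and the 4096-byte default is an assumption about the page size.

```python
from typing import Tuple


def page_span(addr: int, length: int, page_size: int = 4096) -> Tuple[int, int]:
    """Return (start, size) of the page-aligned region covering [addr, addr + length)."""
    if length <= 0 or page_size <= 0:
        raise ValueError("length and page_size must be positive")
    # Round the start down to a page boundary.
    start = addr - (addr % page_size)
    # Round the end up to a page boundary (ceiling division).
    end = -(-(addr + length) // page_size) * page_size
    return start, end - start


def collateral_bytes(addr: int, length: int, page_size: int = 4096) -> int:
    """Bytes outside the requested region that share its protection."""
    _, span = page_span(addr, length, page_size)
    return span - length
```

With these assumptions, protecting an 8-byte variable at address 0x1004 affects the whole 4096-byte page starting at 0x1000, so 4088 bytes of unrelated data change protection along with it. If the same variable straddles a page boundary, two pages are affected.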
Of course, if you had segments, this problem would be trivial. Segments can be any size, so mprotect() would just set the permissions on a segment containing exactly the variable you want to protect.
So what's the problem here? The answer is quite simple: You're using the same abstraction for two different purposes. Virtual memory involves two completely unrelated activities. The first is defining a mapping from virtual memory addresses to physical memory addresses (or to some not-present flag that allows it to be lazily allocated or fetched from disk). The second is defining protections on regions of memory. Pages are a good solution for the first activity, segments for the second.
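The separation can be illustrated with a toy model that keeps the two activities in two different data structures: a page table for translation and a sorted list of byte ranges for protection. This is only a sketch under stated assumptions, not a description of any real MMU; the class `ToyMMU`, its method names, the 4096-byte page size, and the default "rw" permission are all hypothetical.

```python
import bisect

PAGE_SIZE = 4096  # assumed page size for the translation side


class ToyMMU:
    def __init__(self, default_perms: str = "rw"):
        # Translation: virtual page number -> physical frame number.
        self.page_table = {}
        # Protection: sorted, non-overlapping (start, end, perms) byte ranges.
        self._starts = []
        self._ranges = []
        self.default_perms = default_perms

    def map_page(self, vpn: int, pfn: int) -> None:
        self.page_table[vpn] = pfn

    def protect(self, start: int, length: int, perms: str) -> None:
        """Set perms on exactly [start, start + length); overlaps are rejected."""
        if length <= 0:
            raise ValueError("length must be positive")
        end = start + length
        i = bisect.bisect_left(self._starts, start)
        if i > 0 and self._ranges[i - 1][1] > start:
            raise ValueError("overlaps an existing range")
        if i < len(self._ranges) and self._ranges[i][0] < end:
            raise ValueError("overlaps an existing range")
        self._starts.insert(i, start)
        self._ranges.insert(i, (start, end, perms))

    def perms_at(self, addr: int) -> str:
        # Find the last range that starts at or before addr.
        i = bisect.bisect_right(self._starts, addr) - 1
        if i >= 0:
            _, end, perms = self._ranges[i]
            if addr < end:
                return perms
        return self.default_perms

    def access(self, addr: int, kind: str) -> int:
        """Check protection first, then translate; kind is 'r', 'w', or 'x'."""
        if kind not in self.perms_at(addr):
            raise PermissionError(f"{kind!r} access denied at {addr:#x}")
        vpn, offset = divmod(addr, PAGE_SIZE)
        if vpn not in self.page_table:
            raise LookupError(f"page fault at {addr:#x}")
        return self.page_table[vpn] * PAGE_SIZE + offset
```

In this model, marking eight bytes at 0x1004 read-only leaves the rest of the same page writable, because the protection lookup never consults the page table and the translation lookup never consults the ranges.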
In 2002, some guys at MIT proposed a solution to this problem, which they called Mondrian memory protection. This solution completely separated the protection and translation aspects, optionally using paging for translation, and using a dense, variable-sized set of permission ranges. So far, this capability hasn't been found in any modern CPUs.
Every time you see a report of a buffer overflow vulnerability, or a vulnerability that comes from writing executable code into some application memory, think about this: If CPU designers had used two abstractions instead of one for virtual memory back in the 1970s, none of these exploits would have been possible.
As an aside, x86 actually allows you to use segments for protection and pages for translation. Unfortunately, it restricts you to 8192 segments per process, which isn't enough to be useful. Rather than extending this number, AMD entirely removed the segmentation mechanism from x86-64.