4.8 Discussion and Summary
Let us close this chapter by discussing the rationale behind some of the virtual memory choices of Linux/ia64.
First, there is the question of page size. The IA-64 architecture supports a large number of different page sizes including at least 4 Kbytes, 8 Kbytes, 16 Kbytes, 64 Kbytes, 256 Kbytes, 1 Mbyte, 4 Mbytes, 16 Mbytes, 64 Mbytes, and 256 Mbytes. Linux uses a three-level page table, so the page size it uses directly affects the amount of virtual memory that a page table can map. For example, with a page size of 4 Kbytes, only 39 virtual address bits can be mapped. Thus, even though IA-64 can support pages as small as 4 Kbytes, from the perspective of Linux it is much better to pick a larger page size. Indeed, each time the page size is doubled, the page offset and the index into each of the three page-table levels grow by one bit each, so the amount of virtual memory that can be mapped increases by a factor of 16. With a page size of 64 Kbytes, for example, 55 virtual address bits can be mapped.
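The relationship between page size and mappable address bits can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming that every directory in the three-level page table occupies exactly one page and that each entry is 8 bytes long; the function name and constants are illustrative and do not come from the kernel sources.

```python
PTE_SIZE = 8   # bytes per page-table entry (assumption: 8-byte entries)
LEVELS = 3     # three-level page table, as described in the text

def mappable_address_bits(page_size):
    """Virtual address bits covered when each directory fills one page."""
    offset_bits = page_size.bit_length() - 1               # log2(page size)
    index_bits = (page_size // PTE_SIZE).bit_length() - 1  # log2(entries per page)
    return LEVELS * index_bits + offset_bits

for kbytes in (4, 8, 16, 64):
    bits = mappable_address_bits(kbytes * 1024)
    print(f"{kbytes:>2} Kbytes -> {bits} address bits")
```

Under these assumptions the sketch reproduces the figures quoted above: 39 bits for 4-Kbyte pages and 55 bits for 64-Kbyte pages, with each doubling of the page size adding four bits.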
Another consideration in choosing a page size is that programs should perform no worse on a 64-bit platform than on a 32-bit platform. Because pointers are twice as big on IA-64 as on a 32-bit platform, the data structures of a program may also be up to twice as big (on average, the data size expansion factor is much smaller: on the order of 20 to 30 percent [47]). Given that IA-32 uses a page size of 4 Kbytes, this would suggest a page size of 8 Kbytes for IA-64. This size would guarantee that data structures that fit on a single page on IA-32 would also fit on a single page on IA-64. But we should also consider code size. Comparing the size of the text section of equivalent IA-32 and IA-64 programs, we find that IA-64 code is typically between two and four times larger. This would suggest a page size of 16 Kbytes.
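The reasoning above amounts to scaling the IA-32 page size by an expansion factor and rounding up to the next power of two. A small sketch of that arithmetic follows; the helper name is hypothetical, and the factors are the worst-case values quoted in the text rather than measurements.

```python
IA32_PAGE_SIZE = 4096  # bytes, the IA-32 page size cited in the text

def suggested_page_size(base_page_size, expansion_factor):
    """Smallest power-of-two multiple of the base page size that holds the expanded contents."""
    needed = base_page_size * expansion_factor
    size = base_page_size
    while size < needed:
        size *= 2
    return size

# Worst-case data growth (pointers double in size).
print(suggested_page_size(IA32_PAGE_SIZE, 2.0))
# Upper end of the observed code growth.
print(suggested_page_size(IA32_PAGE_SIZE, 4.0))
```

With a factor of 2 the helper yields 8 Kbytes, and with a factor of 4 it yields 16 Kbytes, matching the two candidate sizes discussed above.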
The optimal page size depends heavily on both the machine and the applications in use. Taking all these factors into account, it is clear that no single page size can be optimal for all cases. Consequently, Linux/ia64 adopts the pragmatic solution of providing a kernel compile-time choice for the page size. In most cases, a choice of 8 Kbytes or 16 Kbytes would be reasonable, but under certain circumstances a page size as small as 4 Kbytes or as large as 64 Kbytes could be preferable. Of course, this solution implies that applications must not rely on Linux/ia64 implementing a particular page size. Fortunately, this is not a problem because the few applications that really do need to know the page size can obtain it by calling the getpagesize() library routine.
Another approach to dealing with the page size issue is to put intelligence into the operating system to detect when a series of pages can be mapped by a single page of larger size, i.e., by a superpage. Superpage support can often mitigate some of the performance problems that occur when the page size is too small for a particular application. However, it does introduce additional complexity that could slow down all programs. More importantly, because superpages do not change the mapping granularity, they do not increase the amount of virtual memory that a page table can map.
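To illustrate the point about not hard-coding the page size, the sketch below queries it at run time and uses the result to round a buffer size up to a whole number of pages. It uses Python's os.sysconf("SC_PAGESIZE"), which reports the same value that getpagesize() returns to a C program; the helper names are illustrative.

```python
import os

def page_size():
    """Page size of the running system in bytes, queried at run time."""
    return os.sysconf("SC_PAGESIZE")

def round_up_to_page(nbytes):
    """Round nbytes up to the next multiple of the system page size."""
    size = page_size()
    return (nbytes + size - 1) // size * size

print("page size:", page_size())
print("100000 bytes rounded up:", round_up_to_page(100000))
```

Because the value is looked up rather than assumed, the same program behaves correctly whether the kernel was built with 4-Kbyte, 16-Kbyte, or 64-Kbyte pages.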
Second, the choice with perhaps the most dramatic consequences is the structure of the address space that Linux/ia64 supports. The IA-64 architecture leaves the operating system designer almost complete freedom in this respect. For example, instead of the structure described in this chapter, Linux/ia64 could have implemented a linear address space whose size is determined by the amount of memory that can be mapped by a page table. This approach would have the disadvantage of placing everything in region zero. This would imply that the maximum distance by which different program segments can be separated is limited by the amount of virtual memory that can be mapped by the smallest supported page size (4 Kbytes). This is a surprisingly serious limitation, considering that a page size of 4 Kbytes limits virtual memory to just 2^39 bytes. In contrast, designers can exploit the region support in IA-64 and separate segments by 2^61 bytes, no matter what the page size.
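The separation that regions provide follows from how a 64-bit IA-64 virtual address is split: the top three bits select one of eight regions and the remaining 61 bits are the offset within that region. The following sketch shows the split and compares the resulting segment separation with the 39-bit limit of a single linear address space using 4-Kbyte pages. The function names are illustrative only.

```python
REGION_BITS = 3                 # upper address bits that select a region
OFFSET_BITS = 64 - REGION_BITS  # 61 bits of offset within a region

def split_address(vaddr):
    """Split a 64-bit virtual address into (region number, offset in region)."""
    region = vaddr >> OFFSET_BITS
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    return region, offset

def region_base(region):
    """Lowest virtual address belonging to the given region."""
    return region << OFFSET_BITS

# Segments placed in neighboring regions are 2^61 bytes apart.
separation = region_base(2) - region_base(1)
print(hex(separation))
# Ratio to the 2^39 bytes available in a linear space with 4-Kbyte pages.
print(separation // (1 << 39))
```

The separation is independent of the page size because the region number is taken from fixed address bits, not from the page-table geometry.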
A third choice that is closely related to the address-space structure is the format of the page-table entries. The short format is the most natural choice for Linux/ia64 because it makes it possible to use the VHPT walker by mapping the Linux page table into virtual space. On the downside, the short-format PTEs cannot take advantage of all the capabilities provided by IA-64. For example, with the long format, it would be possible to specify a separate protection key for each PTE, which in turn could enable more effective use of the TLB. Although Linux/ia64 directly maps its page table into the virtually mapped linear page table, the IA-64 architecture does not require this to be the case. An alternative would be to operate the VHPT walker in the long-format mode and map it to a separate cache of recently used translations. This cache can be thought of as a CPU-external (in-memory) TLB. With this, Linux could continue to use a compact 8-byte-long page-table entry and still take full advantage of operating the VHPT walker in long-format mode. Of course, the cost of doing this would be that the cache would take up additional memory and extra work would be required to keep the cache coherent with the page tables.
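The trade-off of such an in-memory translation cache can be seen in a toy model. The sketch below is purely hypothetical: it uses a small direct-mapped table and a Python dictionary as the page table, and it does not reflect the actual long-format VHPT entry layout or hashing. It only illustrates the two costs named above, the extra memory for the cache and the purge needed on every page-table change to keep the cache coherent.

```python
class TranslationCache:
    """Toy direct-mapped cache of recent translations (hypothetical model)."""

    def __init__(self, page_table, slots=8):
        self.page_table = page_table   # authoritative mapping: vpn -> pfn
        self.slots = [None] * slots    # each slot holds (vpn, pfn) or None

    def lookup(self, vpn):
        """Return the pfn for vpn, filling the cache on a miss."""
        index = vpn % len(self.slots)
        entry = self.slots[index]
        if entry is not None and entry[0] == vpn:
            return entry[1]                 # hit: page table not consulted
        pfn = self.page_table.get(vpn)      # miss: fall back to the page table
        if pfn is not None:
            self.slots[index] = (vpn, pfn)
        return pfn

    def update_mapping(self, vpn, pfn):
        """Change the page table and purge any stale cached copy."""
        if pfn is None:
            self.page_table.pop(vpn, None)
        else:
            self.page_table[vpn] = pfn
        index = vpn % len(self.slots)
        entry = self.slots[index]
        if entry is not None and entry[0] == vpn:
            self.slots[index] = None        # coherence: drop the old translation

cache = TranslationCache({1: 100})
print(cache.lookup(1))       # fills the cache from the page table
cache.update_mapping(1, 200)
print(cache.lookup(1))       # sees the new mapping because the slot was purged
```

If update_mapping skipped the purge, the second lookup would return the stale translation, which is exactly the coherence work the text refers to.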
In summary, the IA-64 architecture provides tremendous flexibility in implementing the virtual memory system. The virtual memory design described in this chapter closely matches the needs of Linux, but at the same time it represents just one possible solution in a large design space. Because the machines that Linux/ia64 is running on and the applications that use it change over time, it is likely that the virtual memory system of Linux/ia64 will also change from time to time. While this will affect some of the implementation details, the fundamental principles described in this chapter should remain the same.