3.8 Translation Lookaside Buffer (TLB)
Initially, when the processor needs to map a virtual address to a physical address, it must traverse the full page directory searching for the PTE of interest. This would normally imply that every assembly instruction that references memory requires several additional memory references just for the page table traversal [Tan01]. To avoid this considerable overhead, architectures take advantage of the fact that most processes exhibit locality of reference or, in other words, that large numbers of memory references tend to be for a small number of pages. They exploit this locality by providing a Translation Lookaside Buffer (TLB), which is a small associative memory that caches virtual to physical page table resolutions.
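To make the cost concrete, the following is an illustrative sketch (not code taken from the kernel) of the chain of dependent loads behind a single translation, using the page table navigation macros described earlier in this chapter:

    pgd_t *pgd = pgd_offset(mm, addr);   /* first extra memory reference  */
    pmd_t *pmd = pmd_offset(pgd, addr);  /* second extra memory reference */
    pte_t *pte = pte_offset(pmd, addr);  /* third extra memory reference  */

Each load depends on the result of the previous one, so, without a TLB, every memory-touching instruction would pay this chain before the real access could proceed.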
Linux assumes that most architectures support some type of TLB, although the architecture-independent code does not care how it works. Instead, architecture-dependent hooks are dispersed throughout the VM code at points where it is known that some hardware with a TLB would need to perform a TLB-related operation. For example, when the page tables have been updated, such as after a page fault has completed, the processor may need to update the TLB for that virtual address mapping.
Not all architectures require these types of operations, but, because some do, the hooks have to exist. If the architecture does not require the operation to be performed, the function for that TLB operation will be a null operation that is optimized out at compile time.
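For example, on the x86, where the hardware walks the page tables itself and keeps no separate page table cache, some of the hooks collapse to nothing. The following is a sketch in the style of the i386 headers, not their exact contents:

    /* The i386 keeps no architectural state that these hooks must
     * update, so they compile away to nothing. */
    #define update_mmu_cache(vma, addr, pte)    do { } while (0)
    #define flush_tlb_pgtables(mm, start, end)  do { } while (0)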
A quite large list of TLB API hooks, most of which are declared in <asm/pgtable.h>, is given in Tables 3.2 and 3.3, and the API is well documented in the kernel source in Documentation/cachetlb.txt [Mil00]. It is possible to have just one TLB flush function, but, because both TLB flushes and TLB refills are very expensive operations, unnecessary TLB flushes should be avoided if at all possible. For example, when context switching, Linux will avoid loading new page tables using Lazy TLB Flushing, discussed further in Section 4.3.
Table 3.2. Translation Lookaside Buffer Flush API
void flush_tlb_all(void)
This flushes the entire TLB on all processors running in the system, which makes it the most expensive TLB flush operation. After it completes, all modifications to the page tables will be visible globally. This is required after the kernel page tables, which are global in nature, have been modified, such as after vfree() (see Chapter 7) completes or after the PKMap is flushed (see Chapter 9).
void flush_tlb_mm(struct mm_struct *mm)
This flushes all TLB entries related to the userspace portion (i.e., below PAGE_OFFSET) of the requested mm context. On some architectures, such as MIPS, this will need to be performed for all processors, but usually it is confined to the local processor. This is only called when an operation has been performed that affects the entire address space, such as after all the address mappings have been duplicated with dup_mmap() for fork or after all memory mappings have been deleted with exit_mmap().
void flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
As the name indicates, this flushes all entries within the requested userspace range for the mm context. It is used after a region has been moved or changed, such as during mremap(), which moves regions, or mprotect(), which changes the permissions (a usage sketch follows this table). The function is also used indirectly when unmapping a region with munmap(), which calls tlb_finish_mmu(), which tries to use flush_tlb_range() intelligently. This API is provided for architectures that can remove ranges of TLB entries quickly rather than iterating with flush_tlb_page().
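As a usage illustration, the following sketch shows where flush_tlb_range() would sit in a function that rewrites the protection bits on a range of user pages. The helper fixup_pte_range() is hypothetical and stands in for the real page table walk; this is not the kernel's mprotect() implementation:

    /* Sketch: rewrite the PTEs for [start, end) in mm, then flush the
     * now-stale translations in one ranged operation. */
    static void sketch_change_protection(struct mm_struct *mm,
                                         unsigned long start,
                                         unsigned long end,
                                         pgprot_t newprot)
    {
            fixup_pte_range(mm, start, end, newprot); /* hypothetical PTE walk */
            flush_tlb_range(mm, start, end);          /* drop stale TLB entries */
    }

The ordering matters: the TLB is flushed only after the page table updates are complete so that any refill that happens after the flush is guaranteed to see the new PTEs.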
Table 3.3. Translation Lookaside Buffer Flush API (cont.)
void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr)
Predictably, this API is responsible for flushing a single page from the TLB. The two most common uses of it are for flushing the TLB after a page has been faulted in or has been paged out.
void flush_tlb_pgtables(struct mm_struct *mm, unsigned long start, unsigned long end)
This API is called when the page tables are being torn down and freed. Some platforms cache the lowest level of the page table, i.e., the actual page frame storing entries, which needs to be flushed when the pages are being deleted. This is called when a region is being unmapped and the page directory entries are being reclaimed.
void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr, pte_t pte)
This API is only called after a page fault completes. It tells the architecture-dependent code that a new translation now exists at pte for the virtual address addr. Each architecture decides how this information should be used. For example, Sparc64 uses the information to decide if the local CPU needs to flush its data cache or whether it needs to send an Inter-Processor Interrupt (IPI) to a remote processor. A sketch showing where this hook is called follows this table.
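To show where these hooks land in practice, the following closely follows establish_pte() in mm/memory.c: a new PTE is written, any stale TLB entry for the address is invalidated and the architecture is then informed of the new translation:

    /* Install a new translation: write the PTE, drop any stale TLB
     * entry for this address and invoke the architecture hook. */
    static inline void establish_pte(struct vm_area_struct *vma,
                                     unsigned long address,
                                     pte_t *page_table, pte_t entry)
    {
            set_pte(page_table, entry);            /* write the new PTE      */
            flush_tlb_page(vma, address);          /* invalidate stale entry */
            update_mmu_cache(vma, address, entry); /* arch-dependent hook    */
    }

On architectures such as the i386, the final call compiles away entirely, leaving only the TLB invalidation.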