Latency Versus Throughput
Cooperative multitasking survived for as long as it did because it provided good throughput. Performing a context switch is a relatively expensive operation. On something like PowerPC or SPARC, it's not much more costly than a function call, but on x86 it's considerably more expensive, due to the design of the memory management unit.
This fact gives us the first set of conflicting requirements. For the best throughput, you should let a process run for as long as it can, and only then switch to the next process. Cooperative multitasking is the logical conclusion of this approach: you incur the context-switching overhead only when you actually need to run a new process.
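To make that concrete, here is a minimal sketch of the cooperative model in C. The task functions and the round-robin loop are purely illustrative, not any real kernel's API; the point is that the "scheduler" only regains control when a task voluntarily returns it.

```c
#include <stdio.h>

/* A minimal sketch of cooperative multitasking: each task runs until it
 * chooses to return (yield).  Names and structure are illustrative only. */

typedef void (*task_fn)(int step);

static void task_a(int step) { printf("task A, step %d\n", step); }
static void task_b(int step) { printf("task B, step %d\n", step); }

int main(void)
{
    task_fn tasks[] = { task_a, task_b };
    const int ntasks = 2;

    /* The "scheduler" runs only when a task voluntarily returns control;
     * a task that loops forever would starve everything else, and the
     * switching overhead is paid only at these voluntary boundaries. */
    for (int step = 0; step < 3; step++)
        for (int i = 0; i < ntasks; i++)
            tasks[i](step);

    return 0;
}
```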
As I mentioned earlier, though, the cost is that the system becomes unresponsive. For activities like video and audio playback, this is a huge problem, and a system that relies on cooperative multitasking fails completely at these tasks. Smooth playback of a video stream requires that the player get access to the CPU at regular intervals in order to decode the next frame and push it to the screen. A busy process can block this access, causing frames to be dropped.
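The timing constraint looks roughly like the sketch below. The decode_frame() stub and the 40 ms period (25 frames per second) are assumptions for illustration; the point is that any time the player spends waiting for the CPU comes out of a fixed per-frame budget.

```c
#include <stdio.h>
#include <time.h>

/* Sketch of the deadline a video player lives under: wake up once per frame
 * period, decode, display, sleep for whatever is left of the budget.
 * decode_frame() stands in for real decoder work; 40 ms assumes 25 fps. */

static void decode_frame(int n) { printf("frame %d decoded\n", n); }

int main(void)
{
    const long frame_ns = 40L * 1000 * 1000;   /* 40 ms per frame */

    for (int frame = 0; frame < 5; frame++) {
        struct timespec start, end;
        clock_gettime(CLOCK_MONOTONIC, &start);

        decode_frame(frame);

        clock_gettime(CLOCK_MONOTONIC, &end);
        long used_ns = (end.tv_sec - start.tv_sec) * 1000000000L
                     + (end.tv_nsec - start.tv_nsec);

        /* If decoding plus any time spent waiting for the CPU exceeds the
         * frame period, there is nothing left to sleep and the frame is late. */
        if (used_ns < frame_ns) {
            struct timespec rest = { 0, frame_ns - used_ns };
            nanosleep(&rest, NULL);
        } else {
            printf("frame %d missed its deadline\n", frame);
        }
    }
    return 0;
}
```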
A good scheduler has to balance throughput and latency needs. A throughput-based scheduler is better for a lot of server workloads. For something like a web server, the time spent responding to a request is bounded by network latency, so losing a few tens of milliseconds’ latency to the scheduler won’t impact the user experience, while losing some throughput reduces the number of clients that can be handled at once. For a desktop workload, latency is a lot more important.
The FreeBSD ULE scheduler attempts to address this issue by dividing processes into two categories: "I/O bound" and "CPU bound." Most interactive processes spend the bulk of their time waiting for the user to do something, so they fall into the first category. In contrast, most long-running daemons consume as much CPU as you can throw at them while they're running, but no one cares if they take a few seconds or minutes longer to complete. ULE prioritizes processes that are I/O bound: if a process spends most of its time waiting for some kind of input, it gets to monopolize the CPU briefly when that input arrives.
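The classification heuristic can be sketched roughly as follows. This is a deliberately simplified version of the idea, not ULE's actual scoring formula: a thread that has slept far more than it has run is treated as interactive, and the threshold and sample statistics are invented for the example.

```c
#include <stdio.h>

/* Simplified sketch of an I/O-bound vs. CPU-bound classification: threads
 * that spend far more time voluntarily sleeping than running are treated as
 * interactive.  The 4x threshold and the sample numbers are illustrative. */

struct thread_stats {
    const char   *name;
    unsigned long run_ticks;    /* time spent on the CPU */
    unsigned long sleep_ticks;  /* time spent voluntarily blocked */
};

static int is_interactive(const struct thread_stats *t)
{
    /* I/O-bound threads sleep far more than they run. */
    return t->sleep_ticks > 4 * t->run_ticks;
}

int main(void)
{
    struct thread_stats threads[] = {
        { "text-editor",   10, 990 },  /* mostly waiting for keystrokes */
        { "video-encode", 950,  50 },  /* grinds the CPU continuously   */
    };

    for (int i = 0; i < 2; i++)
        printf("%s: %s\n", threads[i].name,
               is_interactive(&threads[i]) ? "interactive, boost priority"
                                           : "CPU bound, normal queue");
    return 0;
}
```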
Part of the problem with latency scheduling is that it's difficult for the scheduler to get the information it needs. Back to the video player example: the decoder has a fairly strict formal requirement, namely that each frame must be decoded before it is due on screen. But the scheduler often has no way of knowing when the next frame is due, or whether the decoder has finished decoding it. Without a userspace interface for providing this kind of information, it's very difficult for the scheduler to make informed decisions. Instead, it has to rely on very coarse-grained information, such as priority class.
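For comparison, the coarse-grained knob that does exist today is something like the classic nice value: it tells the scheduler "favor me a bit over the default," but says nothing about frame deadlines. A minimal sketch using the POSIX setpriority() call (the value of -5 is arbitrary, and raising priority normally requires elevated privileges):

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/resource.h>

/* The coarse-grained interface available today: a per-process priority
 * (nice value).  It carries no deadline information at all. */

int main(void)
{
    /* Ask for a slightly more favorable nice value for this process. */
    if (setpriority(PRIO_PROCESS, 0, -5) == -1)
        fprintf(stderr, "setpriority failed: %s\n", strerror(errno));
    else
        printf("now running at nice %d\n", getpriority(PRIO_PROCESS, 0));

    return 0;
}
```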