Heterogeneous Cores
If you’re using only a small number of the cores on your processor, you might start to wonder why the rest even exist. The few applications that need all of the cores turned on could run faster on dedicated hardware, so why not use that instead?
We’re starting to see this trend already. Apple’s Core Video is one example: it will run on the CPU if it needs to, on the CPU’s vector unit if it has one, or on the GPU if that would be faster. OpenSSL will run on a cryptographic card if one exists, or fall back to the CPU if not. The existence of general-purpose abstract interfaces to this kind of functionality makes it much easier to move into hardware: only a very small change, made behind the interface, is required to take advantage of the new silicon. We saw the same thing with OpenGL; moving transform and lighting calculations onto the graphics hardware required new drivers to be written, but no modification to existing application code. Most importantly, since dedicated silicon is more efficient than general-purpose hardware at its intended task, the power usage is likely to be lower.
If you have silicon to spare, why not put a GPU on the die? A cryptographic accelerator? Dedicated hardware for other computationally expensive algorithms? When they’re not in use, you could turn them off. When they’re needed, they would still draw less power than running the same algorithms on general-purpose hardware. The first step here was the integration of FPUs and then SIMD units. The next step will likely be the integration of a GPU on-die. Beyond that, it’s likely to be a matter of which algorithms benefit the most from dedicated hardware. In some cases, we’ll simply see extensions to the basic instruction set (as happened with floating-point and SIMD instructions) that provide operations useful to a few categories of algorithms. Eventually, we’re likely to see these evolve beyond simple instructions.
One idea I’ve seen that seems quite appealing is to put a field-programmable gate array on-die. This would allow a lot of flexibility in operation, but is likely to come with a significant power cost.