Hiding Latency With Prefetch
Fetching data from main memory is generally costly. Prefetch is a technique for avoiding, or at least reducing, the time the processor spends waiting for data to arrive in its registers.
With prefetch, data (or instructions) are moved closer to the CPU before they are used, in the hope that they are available by the time the processor needs them. Even if the data has not fully arrived by then, the prefetch still reduces the processor wait, or stall, time.
FIGURE 4 shows this graphically. Note that the figure is a simplification, as it suggests that the time spent in the processor equals the memory reference time. This need not be the case, of course.
FIGURE 4 Prefetch
Thanks to prefetch, the memory latency can be hidden to a certain extent.
Although conceptually quite simple, prefetch is not always easy to implement efficiently in an application:
To hide the latency, the processor must perform sufficient other activities to allow time for the actual prefetching to occur. These activities may not be present in the application, or there may not be enough other resources (for example, registers) available while the prefetch operation is in progress.
Predicting where in the memory hierarchy the data resides is difficult. Usually the location of the data is not constant while an application is executing. The question is then where to prefetch from, because the location in the memory hierarchy dictates the time required for the prefetch to occur.
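The two difficulties above interact: the further away the data resides, the more independent work is needed to cover the prefetch. A common rule of thumb, not taken from this text, is to issue the prefetch a number of loop iterations ahead equal to the memory latency divided by the time of one iteration, rounded up. The sketch below shows only that arithmetic. The function name prefetch_distance and both parameters are hypothetical, and the latency is an assumed input that in practice depends on where the data happens to reside.

```c
#include <assert.h>

/*
 * Rule-of-thumb estimate (an assumption, not a guarantee): how many loop
 * iterations ahead a prefetch must be issued so that the data has arrived
 * by the time it is used.
 *
 * latency_cycles       - assumed latency of the level the data comes from
 * cycles_per_iteration - assumed cost of one loop iteration (must be > 0)
 */
unsigned prefetch_distance(unsigned latency_cycles,
                           unsigned cycles_per_iteration)
{
    assert(cycles_per_iteration > 0);

    /* Integer division rounded up: partial iterations count as whole ones. */
    return (latency_cycles + cycles_per_iteration - 1) / cycles_per_iteration;
}
```

For example, with an assumed latency of 200 cycles and 25 cycles of work per iteration, the prefetch would have to run 8 iterations ahead. If the loop has fewer iterations than that, or too little work per iteration, the latency cannot be hidden completely.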
A prefetch instruction initiates the movement of data from memory towards the processor. There are several key aspects to consider when using prefetch:
Selecting which accesses are likely to miss in the cache and are therefore candidates for prefetch.
Selecting where to place the prefetch instruction (prefetch must be executed sufficiently early).
Knowing the memory address for use in the inserted prefetch instruction.
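The three aspects above can be illustrated with a minimal sketch in C. It assumes a GCC or Clang compiler, whose __builtin_prefetch built-in asks the compiler to emit a prefetch instruction where the target supports one. The function gather_sum, the arrays data and idx, and the value of PREFETCH_DISTANCE are hypothetical and chosen for illustration only. The indirect access data[idx[i]] is the one likely to miss in the cache, the prefetch is placed PREFETCH_DISTANCE iterations before the use, and the address is known early because idx can be read ahead.

```c
#include <stddef.h>

/* Hypothetical look-ahead in iterations; a real value must be tuned by
 * measurement on the target system. */
#define PREFETCH_DISTANCE 16

/* Sum data[] in the order given by idx[]. The scattered reads of data[]
 * are the accesses that are likely to miss in the cache. */
double gather_sum(const double *data, const size_t *idx, size_t n)
{
    double sum = 0.0;

    for (size_t i = 0; i < n; i++) {
        /* Request the element needed PREFETCH_DISTANCE iterations from now.
         * The bounds check keeps the read of idx[] inside the array.
         * Arguments: address, 0 = prefetch for read, 3 = high temporal
         * locality. The prefetch is only a hint and does not change the
         * result. */
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&data[idx[i + PREFETCH_DISTANCE]], 0, 3);

        sum += data[idx[i]];
    }
    return sum;
}
```

Whether this version is faster than the plain loop depends on the system and on the access pattern. If most accesses already hit in the cache, the extra instructions can make the loop slower, so the effect should be measured.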
Despite these difficulties, prefetch is a powerful technique for improving application performance and is worth considering when tuning an application.