Compiler orchestrated prefetching via speculation and predication

Rodric M. Rabbah, Hariharan Sandanagobalane, Mongkol Ekpanyapong, Weng-Fai Wong
2004 Proceedings of the 11th international conference on Architectural support for programming languages and operating systems - ASPLOS-XI  
This paper introduces a compiler-orchestrated prefetching system as a unified framework geared toward ameliorating the gap between processing speeds and memory access latencies. We focus the scope of the optimization on specific subsets of the program dependence graph that succinctly characterize the memory access pattern of both regular array-based applications and irregular pointer-intensive programs. We illustrate how program embedded precomputation via speculative execution can accurately
more » ... edict and effectively prefetch future memory references with negligible overhead. The proposed techniques reduce the total running time of seven SPEC benchmarks and two OLDEN benchmarks by 27% on an Itanium 2 processor. The improvements are in addition to several state-of-the-art optimizations including software pipelining and data prefetching. In addition, we use cycle-accurate simulations to identify important and lightweight architectural innovations that further mitigate the memory system bottleneck. In particular, we focus on the notoriously challenging class of pointerchasing applications, and demonstrate how they may benefit from a novel scheme of sentineled prefetching. Our results for twelve SPEC benchmarks demonstrate that 45% of the processor stalls that are caused by the memory system are avoidable. The techniques in this paper can effectively mask long memory latencies with little instruction overhead, and can readily contribute to the performance of processors today.
doi:10.1145/1024393.1024416 dblp:conf/asplos/RabbahSEW04 fatcat:lfuxavzxijgqhgs7nsggfrqq3u