On the effectiveness of prefetching and reuse in reducing L1 data cache traffic

G. Surendra, Subhasis Banerjee, S. K. Nandy
2004 Proceedings of the 3rd workshop on Memory performance issues in conjunction with the 31st international symposium on computer architecture - WMPI '04  
Reducing the number of data cache accesses improves performance, port efficiency, and bandwidth, and motivates the use of single-ported caches instead of complex and expensive multi-ported ones. In this paper we consider an intrusion detection system as a target application and study the effectiveness of two techniques, (i) prefetching data from the cache into local buffers in the processor core and (ii) load Instruction Reuse (IR), in reducing data cache traffic. The analysis is carried out using a microarchitecture and instruction set representative of a programmable processor, with the aim of determining whether the above techniques are viable for a programmable pattern matching engine found in many network processors. We find that IR is the most generic and efficient technique, reducing cache traffic by up to 60%. However, a combination of prefetching and IR with application-specific tuning performs as well as, and sometimes better than, IR alone.
doi:10.1145/1054943.1054955 dblp:conf/wmpi/SurendraBN04 fatcat:7526wjzttng7hmtpllpfktwxtm