The Tag Filter Architecture: An energy-efficient cache and directory design

Joan J. Valls, Alberto Ros, María E. Gómez, Julio Sahuquillo
2017 Journal of Parallel and Distributed Computing  
Email addresses: joavalmo@fiv.upv.es (Joan J. Valls), aros@ditec.um.es (Alberto Ros), megomez@disca.upv.es (María E. Gómez), jsahuqui@disca.upv.es (Julio Sahuquillo)
Power consumption in current high-performance chip multiprocessors (CMPs) has become a major design concern, one that is aggravated by the trend toward increasing core counts. A significant fraction of the total power budget is consumed by on-chip caches, which are usually deployed with a high degree of associativity (even L1 caches are being implemented with eight ways) to enhance system performance. On a cache access, each way in the corresponding set is accessed in parallel, which is costly in terms of energy. Coherence protocols, in turn, must also implement efficient directory caches that scale in terms of power consumption. Most state-of-the-art techniques that reduce the energy consumption of directories do so at the cost of performance, which may be unacceptable for high-performance CMPs. In this paper we propose an energy-efficient architectural design that can be effectively applied to any kind of cache memory. The proposed approach, called the Tag Filter (TF) Architecture, filters the ways accessed in the target cache set, so that only a few ways are searched in the tag and data arrays. This reduces the dynamic energy consumption of caches without hurting their access time. For this purpose, the proposed architecture holds the X least significant bits of each tag in a small auxiliary X-bit-wide array. These bits are used to filter out the ways whose stored least significant tag bits do not match those of the requested tag. Experimental results show that, on average across the studied applications, the TF architecture reduces dynamic power consumption by up to 74.9%, 85.9%, and 84.5% when applied to L1 caches, L2 caches, and directory caches, respectively.
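The filtering step described in the abstract can be illustrated with a small software model. This is only a sketch built from the abstract's description, not the authors' hardware design: the class name `TagFilterSet`, its methods, and the way probes are counted are all hypothetical, and the model assumes that only the ways passing the filter are searched in the tag array.

```python
from typing import List, Optional, Tuple


class TagFilterSet:
    """Toy model of one cache set with an auxiliary X-bit filter per way.

    Hypothetical illustration: names and structure are assumptions,
    not taken from the paper.
    """

    def __init__(self, num_ways: int, filter_bits: int) -> None:
        self.num_ways = num_ways
        self.filter_bits = filter_bits  # X in the abstract
        self.tags: List[Optional[int]] = [None] * num_ways
        # Auxiliary array: the X least significant bits of each stored tag.
        self.filters: List[Optional[int]] = [None] * num_ways

    def _low_bits(self, tag: int) -> int:
        # Keep only the X least significant bits of the tag.
        return tag & ((1 << self.filter_bits) - 1)

    def fill(self, way: int, tag: int) -> None:
        # Store the full tag and its X low bits in the filter array.
        self.tags[way] = tag
        self.filters[way] = self._low_bits(tag)

    def lookup(self, tag: int) -> Tuple[Optional[int], int]:
        """Return (hit way or None, number of ways searched in the tag array)."""
        low = self._low_bits(tag)
        # Ways whose filter bits differ cannot hold the tag, so skip them.
        candidates = [w for w in range(self.num_ways) if self.filters[w] == low]
        hit = next((w for w in candidates if self.tags[w] == tag), None)
        return hit, len(candidates)


if __name__ == "__main__":
    s = TagFilterSet(num_ways=8, filter_bits=2)
    for way, tag in enumerate([0b10100, 0b10101, 0b11001, 0b00110]):
        s.fill(way, tag)
    print(s.lookup(0b10101))  # hit in way 1 after searching 2 of 8 ways
```

In this model a conventional lookup would search all eight ways, whereas the filtered lookup searches only the ways whose stored low bits match; a request whose low bits match no way searches none at all.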
doi:10.1016/j.jpdc.2016.04.016 fatcat:6eeu7a3bvbhu5mjh4himb26bjy