Coupling compiler-enabled and conventional memory accessing for energy efficiency

Raksit Ashok, Saurabh Chheda, Csaba Andras Moritz
2004 ACM Transactions on Computer Systems  
This paper presents Cool-Mem, a family of memory system architectures that integrates conventional memory system mechanisms, energy-aware address translation, and compiler-enabled cache disambiguation techniques to reduce energy consumption in general-purpose architectures. The solutions provided in this paper leverage inter-layer tradeoffs between the architecture, compiler, and operating system layers. Cool-Mem achieves power reduction by statically matching memory operations with energy-efficient cache and virtual memory access mechanisms. It combines statically speculative cache access modes, a dynamic CAM-based Tag-Cache used as a backup for statically mispredicted accesses, conventional multi-level associative cache organizations, protection checking embedded along all cache access mechanisms, and architectural organizations that reduce the power consumed by address translation in virtual memory. Because it is based on speculative static information, a superset of the predictable program information available at compile time, our approach removes the burden of provable correctness from the compiler analysis passes that extract static information. This makes Cool-Mem highly practical and applicable to large, complex applications, without limitations arising from complexity in our compiler passes or from the presence of precompiled static libraries. Based on extensive evaluation of both SPEC2000 and Mediabench applications, we obtain 6% to 19% total energy savings in the processor, with performance ranging from 1.5% degradation to 6% improvement, for the applications studied. We have also compared Cool-Mem to several prior approaches and found it to perform better in almost all cases.

This paper thus presents Cool-Mem as a family of memory system architectures enabled by speculative static compile-time information. Cool-Mem integrates conventional memory access mechanisms with compiler-enabled techniques and energy-aware address translation to reduce energy consumption, further blurring the interface between compiler and architecture. Our experimental results confirm our intuition that combined compiler-architecture designs open up new ways to reduce power consumption and, in many cases, even improve application performance.
The issues raised and solutions provided in this paper leverage inter-layer tradeoffs in memory systems, affecting the architecture, compiler, and even operating system layers. But how can we benefit from static information? Cool-Mem uses static program information about memory access types and patterns to remove some of the redundancy in conventional memory access mechanisms. This redundancy in current memory system architectures stems from a one-size-fits-all design philosophy, in which all memory accesses are treated equally, i.e., handled by a single dynamic mechanism regardless of the situation. For example, each memory operation typically requires a TLB access for virtual-to-physical address translation or for protection checking, and every associative cache access looks up multiple tags and cache blocks to return a single word. As we will show in this paper, a large fraction of this redundancy can be eliminated, resulting in significant power and energy savings. The Cool-Mem architectural components include: (1) support for statically speculative cache access modes; (2) a dynamic CAM-based Tag-Cache used as a backup for statically mispredicted accesses; (3) a conventional multi-level associative cache organization; (4) protection checking embedded along all cache access mechanisms; and (5) a variety of techniques supporting power-aware address translation in virtual memory architectures, since we study a number of different organizations, each with its own advantages and disadvantages. Physically tagged and indexed caches require that cache indexing be overlapped with address translation for performance reasons. Because translation is overlapped with the actual cache access, only the low-order offset bits (which do not change with the translation) are available for cache indexing.
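The redundancy argument above can be made concrete with a back-of-envelope energy model. The sketch below is illustrative only and is not from the paper: all energy weights, the 4-way associativity, and the prediction rate are made-up numbers. It contrasts a conventional set-associative lookup, which reads every tag and data way in parallel, with a statically speculative access that probes only the compiler-predicted way and falls back to a small CAM Tag-Cache plus a conventional retry on a misprediction.

```python
import random

# Illustrative energy weights in arbitrary units (assumptions, not measured).
E_TAG = 1.0       # energy to read one tag array
E_DATA = 2.0      # energy to read one data way
E_TAGCACHE = 0.5  # energy for one CAM Tag-Cache probe

WAYS = 4          # assumed associativity

def conventional_access():
    """Every access reads all WAYS tags and data ways in parallel."""
    return WAYS * (E_TAG + E_DATA)

def speculative_access(predicted_correctly):
    """Probe only the statically predicted way; on a misprediction,
    consult the Tag-Cache and redo the access conventionally."""
    energy = E_TAG + E_DATA
    if not predicted_correctly:
        energy += E_TAGCACHE + conventional_access()
    return energy

def average_energy(prediction_rate, accesses=100_000):
    """Monte Carlo average energy per access at a given prediction rate."""
    rng = random.Random(0)
    total = sum(
        speculative_access(rng.random() < prediction_rate)
        for _ in range(accesses)
    )
    return total / accesses

if __name__ == "__main__":
    print(f"conventional:              {conventional_access():.2f}")
    print(f"speculative (90% correct): {average_energy(0.9):.2f}")
```

Under these assumed weights, a correctly predicted access costs a quarter of a conventional 4-way lookup, so even with the misprediction penalty the speculative path wins whenever the static prediction rate is reasonably high, which is the intuition behind matching accesses to cheaper mechanisms.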
With growing on-chip cache sizes, this constraint is becoming harder to satisfy, leaving virtually tagged and indexed caches as a practical alternative [Patterson and Hennessy 1990]. Cool-Mem employs a virtually tagged and indexed cache as the first-level cache and combines it with either a virtual or a physical second-level cache design, with protection checks integrated along all cache access paths, including the compiler-directed static access path, and moves address translation to upper layers in the memory hierarchy.
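The payoff of deferring translation past a virtually tagged and indexed L1 can be sketched with a simple model. This is a hypothetical illustration, not the paper's evaluation methodology, and the TLB energy weight is an assumption: with translation moved above the L1, TLB energy is paid only on L1 misses, so the average translation cost scales with the miss rate rather than with every access.

```python
# Hypothetical energy weight in arbitrary units (assumption, not measured).
E_TLB = 1.5  # energy of one TLB lookup

def tlb_energy_per_access(l1_miss_rate, translate_every_access):
    """Average TLB energy per memory access.

    translate_every_access=True models a physically indexed/tagged L1,
    where every access needs a translation; False models a virtually
    tagged and indexed L1, where translation is deferred to an L1 miss.
    """
    if translate_every_access:
        return E_TLB
    return l1_miss_rate * E_TLB
```

For example, at an assumed 5% L1 miss rate the deferred design pays on average 0.075 units per access instead of 1.5, a 20x reduction in translation energy under this toy model.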
doi:10.1145/986533.986535