Software-directed power-aware interconnection networks
Proceedings of the 2005 international conference on Compilers, architectures and synthesis for embedded systems - CASES '05
Interconnection networks have been deployed as the communication fabric in a wide spectrum of parallel computer systems, ranging from chip multiprocessors (CMPs) and embedded multicore systems-on-a-chip (SoCs) to clusters and server blades. Recent technology trends have permitted rapid growth of chip resources, faster clock rates, and wider communication bandwidths; however, these trends have also led to an increase in power consumption that is becoming a key limiting factor in the design of
such scalable interconnected systems. Power-aware networks, therefore, need to become inherent components of single- and multi-chip parallel systems. In the hardware arena, recent research on interconnection network power management has employed limited-scope techniques that mostly focus on reducing the power consumed by the network communication links. Because these limited-scope techniques are not tailored to the applications running on the network, power savings and the corresponding impact on network latency vary significantly from one application to the next, as we demonstrate in this paper; in many cases, network performance can suffer severely. In the software arena, extensive research on compile-time optimizations has produced parallelizing compilers that can efficiently map an application onto hardware for high performance. However, research into power-aware parallelizing compilers is in its infancy. In this paper, we take the first steps toward tailoring applications' communication needs at run-time for low power. We propose software techniques that extend the flow of a parallelizing compiler in order to direct run-time network power reduction. We target network links, a significant power consumer in these systems, allowing dynamic voltage scaling (DVS) instructions extracted during static compilation to orchestrate link voltage and frequency transitions for power savings during application run-time. Concurrently, an online hardware mechanism measures network congestion levels and adapts these off-line DVS settings to maximize network performance. Our simulations over three existing parallel systems, ranging from very fine-grained single-chip to coarse-grained multi-chip architectures, show that link power consumption can be reduced by up to 76.3%, with a minor increase in latency, ranging from 0.18% to 6.78% across a number of benchmark suites.

Extension of Conference Paper. Original work appeared in CASES'05 [Soteriou et al. 2005]. The extensions found in the journal paper submission are the following:
- Updated Related Work in Section 2.1.
- Addition of a method for generating software directives under adaptive routing, along with a detailed mathematical model, in Sections 4.1 and 4.1.1.
- Characterization of, and rationale for, the use of buffer utilizations at run-time in concert with software directives, in Section 5.1.
- Explanation of how buffer utilization is used in adaptive routing, in Section 5.2.
- Additional results with adaptive routing in Section 6 for all 28 applications of the three architectures considered.
- Full range of power-performance results (all 28 applications) under traffic perturbation in Section 6.6, along with updated relevant discussion.
The original conference paper only had a subset of these results.
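
The interplay described in the abstract, where a compile-time DVS directive sets a link's operating level and a run-time hardware mechanism adjusts it under congestion, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the four voltage/frequency levels, the 0.6 buffer-utilization threshold, the one-step-up policy, and the names `LEVELS`, `relative_link_power`, and `choose_level`. The power model is the standard first-order CMOS approximation in which dynamic power scales with f * V^2.

```python
# Hypothetical DVS levels as (relative frequency, relative voltage).
# These values are illustrative only; they do not come from the paper.
LEVELS = [(0.25, 0.5), (0.5, 0.7), (0.75, 0.85), (1.0, 1.0)]


def relative_link_power(level: int) -> float:
    """Dynamic link power relative to the top level, using P ~ f * V^2."""
    freq, volt = LEVELS[level]
    return freq * volt * volt


def choose_level(directive_level: int, buffer_utilization: float,
                 high_threshold: float = 0.6) -> int:
    """Pick the level a link actually runs at.

    directive_level    -- level requested by the compile-time directive
    buffer_utilization -- run-time congestion estimate in [0, 1]
    high_threshold     -- assumed congestion threshold (hypothetical)

    Assumed policy: follow the software directive unless the buffer is
    congested, in which case step up one level to protect latency.
    """
    if buffer_utilization > high_threshold:
        return min(directive_level + 1, len(LEVELS) - 1)
    return directive_level


if __name__ == "__main__":
    for util in (0.1, 0.9):
        level = choose_level(directive_level=1, buffer_utilization=util)
        print(f"utilization={util:.1f} -> level {level}, "
              f"relative power {relative_link_power(level):.3f}")
```

Under these assumptions, a lightly loaded link stays at the level the compiler requested, while a congested one trades some of the power saving for performance. The actual directive generation and adaptation policy are those defined in the paper's later sections.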