Profile-driven code execution for low power dissipation

D. Marculescu
2000 ISLPED'00: Proceedings of the 2000 International Symposium on Low Power Electronics and Design (Cat. No.00TH8514)  
This paper proposes a novel technique for power-performance trade-off based on a profile-driven code execution methodology. Specifically, we show that there is an optimal level of parallelism for energy consumption and propose a compiler-assisted technique for code annotation that can be used at run-time to adaptively trade off power and performance. As shown by experimental results, our approach is up to 23% better than clock throttling and is as efficient as voltage scaling (up to 10% better
... some cases). The technique proposed in this paper can be used by an ACPI-compliant power manager for prolonging battery life or as a passive cooling feature for thermal management.

Introduction

Power dissipation has become a critical design concern in recent years, driven by increased levels of complexity and the emergence of mobile applications. While it is generally agreed that tools for power estimation and optimization do exist for hardware specifications at different levels (circuit, gate, register-transfer or behavioral), more work is needed in the area of power analysis and optimization at the microarchitecture, architecture or system level [1]. Tools that can quantify the effect of different performance or power optimization schemes for a piece of code running on a given processor are of extreme importance for computer architects and compiler engineers, who can then characterize different architecture styles not only in terms of their performance, but also in terms of the corresponding energy efficiency.

In the area of power modeling for embedded software, [2] proposes a per-instruction base power model that can be used to find an aggregate power estimate for a sequence of instructions. In [3], the case of DSP applications is addressed. There, the inter-instruction effects turn out to be significant, thus making it possible to develop instruction scheduling techniques that target power minimization. The authors of [4] present an architectural enhancement to reduce the extra work or energy due to mispredicted branches, without significant loss in performance. In [5], a technique for reducing the average power consumption of the pipeline structure is presented. Other approaches target techniques for energy-efficient memory systems [6, 7]. From a different perspective, thermal management has been addressed in [8], where a hardware-driven technique for instruction cache throttling has been proposed.
In this paper we address the problem of energy optimization in modern processors by using compiler-assisted code annotation for variable fetch or execution rate. We improve the state-of-the-art by proposing a novel technique for fine-grain energy characterization based on a profile-driven code execution methodology. Specifically, we show analytically and experimentally that there exists an optimal level of parallelism for energy consumption (which may not necessarily be the same as the one for performance) and propose a compiler-assisted technique for code annotation that adaptively selects at run-time the energy-optimal number of instructions to be fetched or executed in parallel. Energy, as opposed to performance, is a much more data-dependent parameter. As will be shown subsequently, it is indeed possible to use fewer than the maximum number of functional units available and achieve lower energy consumption. We study this effect for the execution stage, as well as for the entire processor. For the first time to our knowledge, we show that there exists an inherent trade-off between performance and energy consumption, due to the data-dependency effect but, most importantly, due to speculative execution and the inherent level of parallelism exhibited by common applications. To validate our results, we use a microarchitecture-level power simulator developed in industry [9]. As shown subsequently, significant savings can be obtained in both energy and power consumption, at the expense of some decrease in performance. The techniques described in this paper can be used as a means for prolonging battery life but, most importantly, for thermal management [10] by achieving significant average power reductions in
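
The notion of an energy-optimal level of parallelism chosen per code region can be illustrated with a minimal Python sketch. This is not the paper's implementation: the names `ProfilePoint`, `annotate_region`, and `max_slowdown`, the region labels, and all profile numbers are hypothetical and chosen only for illustration. The sketch assumes a profiling pass has already recorded cycles and energy for each candidate fetch/execution width, and it selects the width with the lowest energy among those that stay within a tolerated slowdown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfilePoint:
    width: int      # instructions fetched/executed in parallel (hypothetical knob)
    cycles: float   # profiled execution time of the region at this width
    energy: float   # profiled energy of the region, arbitrary units


def annotate_region(points, max_slowdown=1.5):
    """Pick the lowest-energy width whose slowdown, relative to the
    fastest profiled width, does not exceed max_slowdown."""
    if not points:
        raise ValueError("region has no profile points")
    if max_slowdown < 1.0:
        raise ValueError("max_slowdown must be at least 1.0")
    best_cycles = min(p.cycles for p in points)
    allowed = [p for p in points if p.cycles <= max_slowdown * best_cycles]
    # Break energy ties in favor of the faster configuration.
    return min(allowed, key=lambda p: (p.energy, p.cycles)).width


# Made-up profile data, NOT measurements from the paper.
PROFILE = {
    "loop_a": [ProfilePoint(1, 400, 90), ProfilePoint(2, 230, 80),
               ProfilePoint(4, 200, 100)],
    "loop_b": [ProfilePoint(1, 300, 60), ProfilePoint(2, 290, 75),
               ProfilePoint(4, 285, 95)],
}

# One annotation per region, as a compiler pass might emit it.
annotations = {name: annotate_region(pts) for name, pts in PROFILE.items()}
print(annotations)
```

With these made-up numbers, the region with little inherent parallelism (`loop_b`) is annotated with the narrowest width, while `loop_a` settles on an intermediate width; setting `max_slowdown=1.0` recovers the performance-optimal choice instead.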
doi:10.1109/lpe.2000.155294 fatcat:t6d2yxetzbfu7bzdj4x75wo5ku