
Dynamically exploiting narrow width operands to improve processor power and performance

D. Brooks, M. Martonosi
1999 Proceedings Fifth International Symposium on High-Performance Computer Architecture  
The second optimization improves performance by merging together narrow integer operations and allowing them to share a single functional unit.  ...  The first, power-oriented, optimization reduces processor power consumption by using aggressive clock gating to turn off portions of integer arithmetic units that will be unnecessary for narrow bitwidth  ...  Clark, and the reviewers for their helpful comments on early paper drafts. Research support includes funds from DARPA DABT63-97-C-1001, NSF MIP-97-08624, and a donation from Intel Corp.  ... 
doi:10.1109/hpca.1999.744314 dblp:conf/hpca/BrooksM99 fatcat:fzuye3dof5a3bltdsqjuponase


Somnath Paul, Swarup Bhunia
2010 Proceedings of the 16th ACM/IEEE international symposium on Low power electronics and design - ISLPED '10  
Compared to the existing vari-cycle approach, the proposed scheme demonstrates a large improvement in yield (~27% at highest performance bin) and profit (10-15%) for a set of benchmark applications.  ...  In this paper, we propose VAIL, a novel low-overhead instruction scheduling strategy that assigns best-case frequency by issuing the narrow-width (NW) operations to slower units.  ...  Different FU architectures which exploit NW operands for power/performance improvement have been widely researched [5].  ... 
doi:10.1145/1840845.1840854 dblp:conf/islped/PaulB10 fatcat:qc6mogicgzh65juhjd6dd2mmqu

Speculative software management of datapath-width for energy optimization

Gilles Pokam, Olivier Rochecouste, André Seznec, François Bodin
2004 Proceedings of the 2004 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools - LCTES '04  
This paper evaluates managing the processor's datapath-width at the compiler level by means of exploiting dynamic narrow-width operands.  ...  We capitalize on the large occurrence of these operands in multimedia programs to build static narrow-width regions that may be directly exposed to the compiler.  ...  It is therefore of importance to improve the compiler's effectiveness to manage both power and performance.  ... 
doi:10.1145/997163.997175 dblp:conf/lctrts/PokamRSB04 fatcat:ywf7dmx27vhabjizcbdnmvamte

Value-based clock gating and operation packing: dynamic strategies for improving processor power and performance

David Brooks, Margaret Martonosi
2000 ACM Transactions on Computer Systems  
Our second, performance-oriented optimization improves processor performance by packing together narrow-width operations so that they share a single arithmetic unit.  ...  The first, power-oriented optimization reduces processor power consumption by using operand-value-based clock gating to turn off portions of arithmetic units that will be unused by narrow-width operations  ...  In order to augment compile-time analysis, we present two techniques to dynamically exploit narrow-width data.  ... 
doi:10.1145/350853.350856 fatcat:ztn256zwcvahnchyjskwuhyxw4

Width-Aware Fine-Grained Dynamic Supply Gating: A Design Methodology for Low-Power Datapath and Memory

Lei Wang, Somnath Paul, Swarup Bhunia
2012 2012 25th International Conference on VLSI Design  
The approach exploits the abundance of narrow-width (NW) operands in general-purpose and embedded applications to "supply-gate" unused parts of integer execution units and memory blocks while they are  ...  In this paper, we propose a novel fine-grained width-aware dynamic supply gating (WADSG) approach to reduce both active leakage and redundant switching power in datapath and embedded memory (e.g.  ...  We exploit this abundance of NW operands to develop a width-aware dynamic supply gating (WADSG) methodology which supply gates unused logic gates (cells) in datapath (memory).  ... 
doi:10.1109/vlsid.2012.94 dblp:conf/vlsid/WangPB12 fatcat:ri63lpxonvfmbdk7r4vzld5c2q

Dynamic Transfer of Computation to Processor Cache for Yield and Reliability Improvement

Somnath Paul, Swarup Bhunia
2011 IEEE Transactions on Very Large Scale Integration (VLSI) Systems  
We note that although the worst-case latency of memory based computation can be considerably higher than regular operation latency, the average latency is only modestly higher due to the abundance of narrow-width  ...  In this paper, we propose a novel memory-based computational framework that exploits the on-chip memory to perform computation on demand using a lookup table (LUT)-based approach.  ...  Priority Scheduling Exploiting the abundance of narrow width operands, one can further improve the performance overhead when MBC is used for defect tolerance.  ... 
doi:10.1109/tvlsi.2010.2049389 fatcat:p37vgcte2jhzhdi33swanugov4


2010 Journal of Circuits, Systems and Computers  
In this paper, we propose two techniques to reduce the energy dissipation of the issue queue by exploiting the immediate operand files of the stored instructions: firstly by storing immediate operands in separate  ...  immediate operand files rather than storing them inside the issue queue entries and secondly by issue queue partitioning based on widths of immediate operands of instructions.  ...  Issue queue is a component that allows out-of-order execution and improves the performance of the processor.  ... 
doi:10.1142/s0218126610006992 fatcat:4t6lfglnrbcs7oxw6err7uyxuq

On the Exploitation of Narrow-Width Values for Improving Register File Reliability

Jie Hu, Shuai Wang, S.G. Ziavras
2009 IEEE Transactions on Very Large Scale Integration (VLSI) Systems  
In this paper, we propose to exploit narrow-width register values, which present the majority of the generated values, for making a duplicate of the value within the same data item; this in-register duplication  ...  Protecting the register value and its data buses is crucial to reliable computing in high-performance microprocessors due to the increasing susceptibility of CMOS circuitry to soft errors induced by high-energy  ...  BASICS OF REGISTER RENAMING AND NARROW-WIDTH REGISTER VALUES Superscalar microprocessors dynamically exploit instruction-level parallelism (ILP) to issue multiple instructions per cycle for improved performance  ... 
doi:10.1109/tvlsi.2009.2017441 fatcat:ryrga5la2nfplaaafqpvrgf4te

An Overview of Architecture-Level Power- and Energy-Efficient Design Techniques [chapter]

Ivan Ratković, Nikola Bežanić, Osman S. Ünsal, Adrian Cristal, Veljko Milutinović
2015 Advances in Computers  
Both computer architects and circuit designers intend to reduce power and energy (without a performance  ...  Power dissipation and energy consumption became the primary design constraint for almost all computer systems in the last 15 years.  ...  Nagle [48] "Dynamically exploiting narrow width operands to improve processor power and performance," D. Brooks and M.  ... 
doi:10.1016/bs.adcom.2015.04.001 fatcat:5voowf7sizcpxaumb74nelc25a

Prioritizing verification via value-based correctness criticality

Joonhyuk Yoo, Manoj Franklin
2007 2007 25th International Conference on Computer Design  
However, such a full re-execution significantly increases the demand on the processor resources, resulting in severe performance degradation.  ...  A likelihood of correctness criticality is computed by a value vulnerability factor, which is defined by the numerically significant bit-width used to compute a result.  ...  However, in the case that AVF is equal to zero, MITF goes to infinity and does not capture the performance improvement.  ... 
doi:10.1109/iccd.2007.4601921 dblp:conf/iccd/YooF07 fatcat:mbedrvsu6rb7hoeahvzucj6jb4

Empowering a helper cluster through data-width aware instruction selection policies

O.S. Unsal, O. Ergin, X. Vera, A. Gonzalez
2006 Proceedings 20th IEEE International Parallel & Distributed Processing Symposium  
On the other hand, clustering mechanisms enable cost- and performance-effective scaling of processor back-end features.  ...  Those attributes can be combined synergistically to design special clusters operating on narrow values (a.k.a. Helper Cluster), potentially providing performance benefits.  ...  In [14] and [20], narrow width operands were exploited to reduce the power requirements of a value predictor.  ... 
doi:10.1109/ipdps.2006.1639350 dblp:conf/ipps/UnsalEVG06 fatcat:os6xaqbcyvgqhnxdqbkzwyjk24

Tag simplification: Achieving power efficiency through reducing the complexity of the wakeup logic

Mehmet Burak Aykenar, Muhammet Ozgur, Vehbi Esref Bayraktar, Oguz Ergin
2011 2011 International Conference on Energy Aware Computing  
Our design reduces the dynamic energy dissipation of the CAM array inside the issue queue 15% with virtually no impact on performance.  ...  Contemporary microprocessor cores employ out-of-order execution in order to boost performance.  ...  ACKNOWLEDGMENT This work was partially supported by the Scientific and Technological Research Council of Turkey (TUBITAK) through the research grant 109E043.  ... 
doi:10.1109/iceac.2011.6136701 dblp:conf/iceac/AykenarOBE11 fatcat:eqgkyqcqlbdphesjiyshubpm24

Exploiting residue number system for power-efficient digital signal processing in embedded processors

Rooju Chokshi, Krzysztof S. Berezowski, Aviral Shrivastava, Stanislaw J. Piestrak
2009 Proceedings of the 2009 international conference on Compilers, architecture, and synthesis for embedded systems - CASES '09  
Our experiments not only demonstrate simultaneous improvement of up to 30% in performance and 57% reduction in functional unit power consumption, but also that most of these benefits can be exploited with  ...  2's complement number system imposes a fundamental limitation on the power and performance of arithmetic circuits, due to the fundamental need of cross-datapath carry propagation.  ...  In this work we address the challenge of exploiting the power and performance benefits of RNS arithmetic in a RISC processor in a multi-tier manner.  ... 
doi:10.1145/1629395.1629401 dblp:conf/cases/ChokshiBSP09 fatcat:5cksqwuexzdhbkn3wchujjscpm

Asymmetrically banked value-aware register files for low-energy and high-performance

Shuai Wang, Hongyan Yang, Jie Hu, Sotirios G. Ziavras
2008 Microprocessors and microsystems  
In this paper, we propose a new microarchitecture, the asymmetrically banked value-aware register file (AB-VARF), to exploit the prevailing narrow-width register values for low-latency and energy-efficient  ...  Augmented with a value width predictor, the register renaming logic is slightly tuned to rename predicted narrow-width registers to the corresponding narrow-width banks.  ...  Register files and register renaming Superscalar microprocessors dynamically exploit instruction level parallelism (ILP) to issue multiple instructions per cycle for improved performance.  ... 
doi:10.1016/j.micpro.2007.10.004 fatcat:rw6allqxtrduzfqvh4yblqgune

Reliability improvement in multicore architectures through computing in embedded memory

Hadi Hajimiri, Somnath Paul, Anandaroop Ghosh, Swarup Bhunia, Prabhat Mishra
2011 2011 IEEE 54th International Midwest Symposium on Circuits and Systems (MWSCAS)  
The private as well as shared caches are used to perform computation on demand using a lookup table.  ...  Experimental results demonstrate that on-demand memory based computing can significantly improve reliability with minor loss in performance.  ...  the number of memory accesses for the generation and addition of partial products in case of narrow width operands.  ... 
doi:10.1109/mwscas.2011.6026672 fatcat:pgzjfeh6zfb7vo4r5fceboytt4