Dynamically reducing pressure on the physical register file through simple register sharing
2004
IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)
The first technique dynamically combines physical registers having the same value. ...
Despite the simplicity, our design reduces the required number of physical registers by more than 10% on some applications, and provides almost half of the total benefits of an aggressive (complex) scheme ...
Our approach is complementary to these approaches in that we reduce the demand of physical registers through sharing. ...
doi:10.1109/ispass.2004.1291358
dblp:conf/ispass/TranNNDH04
fatcat:qx7t5ugjurcnpe4ohpd4h43l5e
Exploring the limits of early register release
2009
ACM Transactions on Architecture and Code Optimization (TACO)
Register pressure in modern superscalar processors can be reduced by releasing registers early and by copying their contents to cheap back-up storage. ...
On the other hand, compilers have a global view of the program and, using simple dataflow analysis, can determine the last use. ...
By recycling physical registers much earlier than usual, register pressure is reduced. ...
doi:10.1145/1582710.1582714
fatcat:tjlrpgtys5bo7c7tw2j27ouvqa
Compiler directed early register release
2005
14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05)
This paper presents a novel compiler directed technique to reduce the register pressure and power of the register file by releasing registers early. ...
Upon issuing an instruction with one of these logical registers as a source, the processor knows that there will be no more uses of it and can release the register through checkpointing. ...
doi:10.1109/pact.2005.14
dblp:conf/IEEEpact/JonesOAGE05
fatcat:wcy4bz3ribeznojxb47at26ela
Asymmetrically banked value-aware register files for low-energy and high-performance
2008
Microprocessors and microsystems
Our experimental evaluation with SPEC CINT2000 benchmark suite shows that AB-VARF reduces the energy consumption by 78.4% over a conventional register file, on the average, at the cost of a 0.7% performance ...
register file designs. ...
Other proposals try to reduce the required register file size for reduced access latency and energy consumption by delaying the physical register allocation [14, 26], sharing physical registers to exploit ...
doi:10.1016/j.micpro.2007.10.004
fatcat:rw6allqxtrduzfqvh4yblqgune
Software-Directed Techniques for Improved GPU Register File Utilization
2018
ACM Transactions on Architecture and Code Optimization (TACO)
An in-depth evaluation on a large suite of applications shows that just our early register technique outperforms previous work on dynamic register allocation, and together these approaches, on average, ...
This article seeks to increase the thread occupancy and improve performance of these register-bound applications by making more efficient use of the existing register file capacity. ...
We also performed a sensitivity study on the size of the scalar register file, reducing it from 4KB to 3KB and 2KB. ...
doi:10.1145/3243905
fatcat:j4cejqjwcjerfat42nq5pop774
Efficient resources assignment schemes for clustered multithreaded processors
2008
Proceedings, International Parallel and Distributed Processing Symposium (IPDPS)
On the other hand, clustering architectures have been widely studied in order to reduce the inherent complexity of current monolithic processors. ...
On the one hand, exploiting instruction level parallelism is leading us to diminishing returns and therefore exploiting other sources of parallelism like thread level parallelism is needed in order to ...
Physical register file The other main shared resource where thread starvation occurs is the physical register file. ...
doi:10.1109/ipdps.2008.4536226
dblp:conf/ipps/LatorreGG08
fatcat:46m2rbhy3vehfed2pak2lw5ifa
Operand Registers and Explicit Operand Forwarding
2009
IEEE computer architecture letters
An evaluation shows that capturing operand bandwidth close to the function units allows operand registers to reduce the energy consumed in the register files and forwarding network of an embedded processor ...
Operand register files are small, inexpensive register files that are integrated with function units in the execute stage of the pipeline, effectively extending the pipeline operand registers into register ...
Furthermore, reference filtering by the operand registers reduces demand for operand bandwidth from the shared general-purpose registers, which allows the number of read ports to the general-purpose register ...
doi:10.1109/l-ca.2009.45
fatcat:xqx3yka73fgfvfonvsupg7ax3y
Evaluating the use of register queues in software pipelined loops
2001
IEEE transactions on computers
Using RQs, the compiler can allocate physical registers to store live values in the software pipelined loop while minimizing the pressure placed on architected registers. ...
Through the use of RQs, we can minimize the register pressure and code expansion caused by software pipelining. ...
The authors also thank Bob Rau and Alexandre Eichenberger for providing the loop kernels used in this study. ...
doi:10.1109/12.946998
fatcat:rsl6b7hforg5zbnxpkrsk2k34u
Evaluating the use of register queues in software pipelined loops
2001
IEEE transactions on computers
Using RQs, the compiler can allocate physical registers to store live values in the software pipelined loop while minimizing the pressure placed on architected registers. ...
Through the use of RQs, we can minimize the register pressure and code expansion caused by software pipelining. ...
The authors also thank Bob Rau and Alexandre Eichenberger for providing the loop kernels used in this study. ...
doi:10.1109/tc.2001.947006
fatcat:2db3qaphs5fmhfzo66flhxqyyi
Reducing register pressure in SMT processors through L2-miss-driven early register release
2008
ACM Transactions on Architecture and Code Optimization (TACO)
The register file is one of the most critical datapath components limiting the number of threads that can be supported on a simultaneous multithreading (SMT) processor. ...
To allow the use of smaller register files without degrading performance, techniques that maximize the efficiency of using registers through aggressive register allocation/deallocation can be considered ...
Finally, the third set of solutions reduces the number of registers through the use of register sharing [Balakrishnan and Sohi 2003]. ...
doi:10.1145/1455650.1455652
fatcat:54a4w3qoufc5zfxqlwhr6ejgqu
A Hierarchical Thread Scheduler and Register File for Energy-Efficient Throughput Processors
2012
ACM Transactions on Computer Systems
register file hierarchy reduces register file energy by 54%. ...
Second, we propose replacing the monolithic register file found on modern designs with a hierarchical register file. ...
Acknowledgments We thank the anonymous reviewers and the members of the NVIDIA Architecture Research Group for their comments. ...
doi:10.1145/2166879.2166882
fatcat:cwh624dhdbbcffra6mr6kkorgu
NoSQ: Store-Load Communication without a Store Queue
2006
2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06)
The primary benefit of NoSQ is a simple, fast datapath that does not contain store-load forwarding hardware; all loads get their values either from the data cache or from the register file. ...
Acknowledgments The authors thank their reviewers for their comments and suggestions for improving this ...
doi:10.1109/micro.2006.39
dblp:conf/micro/ShaMR06
fatcat:nedxknqu4fhbvimd3fq5aeomxm
NoSQ: Store-Load Communication without a Store Queue
2007
IEEE Micro
Moreover, SMB reduces register file pressure by allowing the definition and the load in a definition-store-load-use chain to share a single physical register. ...
Extending the commit pipeline might increase pressure on core structures such as the reorder buffer, load and store queues, and register file. ...
doi:10.1109/mm.2007.17
fatcat:sbzoag742nbozpqjioodroxzfe
Balancing register allocation across threads for a multithreaded network processor
2004
SIGPLAN notices
To reduce the register needs, move instructions are inserted at program points that split the live ranges or the nodes on the interference graph. ...
We first estimate the register requirement bounds, then reduce from the upper bound gradually to achieve a good register balance among threads. ...
The threads on one PU share the computation power of the PU and register files etc. Formally, the model is as follows: 1. ...
doi:10.1145/996893.996876
fatcat:5buakzxinje77fzqerf6sbktdi
Balancing register allocation across threads for a multithreaded network processor
2004
Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation - PLDI '04
To reduce the register needs, move instructions are inserted at program points that split the live ranges or the nodes on the interference graph. ...
We first estimate the register requirement bounds, then reduce from the upper bound gradually to achieve a good register balance among threads. ...
The threads on one PU share the computation power of the PU and register files etc. Formally, the model is as follows: 1. ...
doi:10.1145/996841.996876
dblp:conf/pldi/ZhuangP04
fatcat:yau3ceqfojaynbszkonvddhfiq