3,105 Hits in 2.3 sec

Lightweight predication support for out of order processors

Mark Stephenson, Lixin Zhang, Ram Rangan
2009 2009 IEEE 15th International Symposium on High Performance Computer Architecture  
The benefits of Out of Order (OOO) processing are well known, as is the effectiveness of predicated execution for unpredictable control flow.  ...  For instance, the only form of predication supported by modern OOO processors is a simple conditional move.  ...  We are also grateful for the support that we received from the members of the Novel Systems Architecture group and the Performance and Tools group at IBM's Austin Research Laboratory.  ... 
doi:10.1109/hpca.2009.4798255 dblp:conf/hpca/StephensonZR09 fatcat:exrb6ivuknhyzitwidr666bt4e

Scaling to the end of silicon with EDGE architectures

D. Burger, S.W. Keckler, K.S. McKinley, M. Dahlin, L.K. John, C. Lin, C.R. Moore, J. Burrill, R.G. McDonald, W. Yoder
2004 Computer  
decade, scaling to new levels of power efficiency and high performance.  ...  The TRIPS architecture is the first instantiation of an EDGE instruction set, a new, post-RISC class of instruction set architectures intended to match semiconductor technology evolution over the next  ...  Acknowledgments We thank the following student members of the  ... 
doi:10.1109/mc.2004.65 fatcat:kvdia4bm2velfnlle57bt6qcpy

Efficient Lightweight Compression Alongside Fast Scans

Orestis Polychroniou, Kenneth A. Ross
2015 Proceedings of the 11th International Workshop on Data Management on New Hardware - DaMoN'15  
00 unused extra bit per code unused high order bits per word Works with simple order predicates: <,=,> ❖ Boolean result in overflow bit of b-bit arithmetic ❖ Executing < O(n) operations select … where  ...  Either constant across the entire input, or constant for the next group of items (e.g. frame-of-reference) Lightweight Compression ❖ Compression schemes ❖ Entropy compression ❖ Group nearby similar  ...
doi:10.1145/2771937.2771943 dblp:conf/damon/PolychroniouR15 fatcat:cepziv64bzcytkcidayqkv5cj4

Dynamic vectorization in the E2 dynamic multicore architecture

Andrew Putnam, Aaron Smith, Doug Burger
2011 SIGARCH Computer Architecture News  
We provide details of E2's support for dynamic reconfigurability and show how the EDGE ISA facilitates out-of-order vector execution.  ...  In this paper we describe the preliminary design of a new dynamic multicore processor called E2 that utilizes an EDGE ISA to allow for the dynamic composition of physical cores into logical processors.  ...  Unlike previous in-order vector machines, E2 allows for out-of-order execution of both vectors and scalars.  ...
doi:10.1145/1926367.1926373 fatcat:7lj5z7tfzzawrpt63i4pvfguhi

Design Principles for Synthesizable Processor Cores [chapter]

Pascal Schleuniger, Sally A. McKee, Sven Karlsson
2012 Lecture Notes in Computer Science  
As FPGAs get more competitive, synthesizable processor cores become an attractive choice for embedded computing.  ...  To evaluate their effects, we develop Tinuso, a processor architecture optimized for FPGA implementation.  ...  The authors acknowledge the HiPEAC 2 European Network of Excellence.  ... 
doi:10.1007/978-3-642-28293-5_10 fatcat:qhrpdxwnubdvnk4zayrg7yr2v4

Predicate prediction for efficient out-of-order execution

Weihaw Chuang, Brad Calder
2003 Proceedings of the 17th annual international conference on Supercomputing - ICS '03  
Predicated execution is an important optimization even for an out-of-order processor, since it can eliminate hard to predict branches and help to enable software pipelining.  ...  Using predication with out-of-order execution creates a naming bottleneck, because there can be multiple definitions reaching a use, and not knowing which use is the correct one can stall the processor  ...  We especially would like to thank Intel for providing the Electron compiler sources, and their assistance in using it.  ... 
doi:10.1145/782837.782840 fatcat:dj43frlsubgotcczgcdo2434ie

Predicate prediction for efficient out-of-order execution

Weihaw Chuang, Brad Calder
2003 Proceedings of the 17th annual international conference on Supercomputing - ICS '03  
Predicated execution is an important optimization even for an out-of-order processor, since it can eliminate hard to predict branches and help to enable software pipelining.  ...  Using predication with out-of-order execution creates a naming bottleneck, because there can be multiple definitions reaching a use, and not knowing which use is the correct one can stall the processor  ...  We especially would like to thank Intel for providing the Electron compiler sources, and their assistance in using it.  ... 
doi:10.1145/782814.782840 dblp:conf/ics/ChuangC03 fatcat:demrdw43czdurcqquziygfyji4

Speeding up control-dominated applications through microarchitectural customizations in embedded processors

Peter Petrov, Alex Orailoglu
2001 Proceedings of the 38th conference on Design automation - DAC '01  
We present a methodology for microarchitectural customization of embedded processors by exploiting application information, thus attaining the twin benefits of processor standardization and application-specific  ...  processors in complex co-designs for control intensive systems. This assembly code is part of the ADPCM Encode benchmark [8] and was produced by gcc for the SimpleScalar toolset [9].  ...  For embedded processors, which due to more stringent power consumption limitations lack the capability for multiple instruction issue and out-of-order execution, the time interval between the branch condition  ...
doi:10.1145/378239.379014 dblp:conf/dac/PetrovO01 fatcat:uiukhdptgjgc3ojpl5t5d5jw2q

Merge

Michael D. Linderman, Jamison D. Collins, Hong Wang, Teresa H. Meng
2008 Proceedings of the 13th international conference on Architectural support for programming languages and operating systems - ASPLOS XIII  
SMP system with Intel Xeon processors.  ...  The Merge framework provides (1) a predicate dispatch-based library system for managing and invoking function variants for multiple architectures; (2) a high-level, library-oriented parallel language based  ...  Acknowledgments We would like to thank Perry Wang, Hong Jiang, Xinmin Tian, and Ghassan Yacoub for their help with this project. We also appreciate the support of Shekhar Borkar and Joe Schutz.  ... 
doi:10.1145/1346281.1346318 dblp:conf/asplos/LindermanCWM08 fatcat:o32q3ujmgba7pkah7olpyh3qkm

Merge

Michael D. Linderman, Jamison D. Collins, Hong Wang, Teresa H. Meng
2008 SIGARCH Computer Architecture News  
SMP system with Intel Xeon processors.  ...  The Merge framework provides (1) a predicate dispatch-based library system for managing and invoking function variants for multiple architectures; (2) a high-level, library-oriented parallel language based  ...  Acknowledgments We would like to thank Perry Wang, Hong Jiang, Xinmin Tian, and Ghassan Yacoub for their help with this project. We also appreciate the support of Shekhar Borkar and Joe Schutz.  ... 
doi:10.1145/1353534.1346318 fatcat:vze7gnq63veolbnomrmid7mlae

A reprogrammable customization framework for efficient branch resolution in embedded processors

Peter Petrov, Alex Orailoglu
2005 ACM Transactions on Embedded Computing Systems  
Experimental results show that for a representative set of control-dominated applications a reduction in the range of 3-22% in processor cycles can be achieved, thus extending the scope of low-cost embedded  ...  processors in complex codesigns for control intensive systems.  ...  For embedded processors, which due to more stringent power consumption limitations lack the capability for multiple instruction issue and out-of-order execution, the time interval between the branch condition  ... 
doi:10.1145/1067915.1067924 fatcat:q5u7lok6fzcppkggatkek6epoa

Merge

Michael D. Linderman, Jamison D. Collins, Hong Wang, Teresa H. Meng
2008 SIGPLAN notices  
SMP system with Intel Xeon processors.  ...  The Merge framework provides (1) a predicate dispatch-based library system for managing and invoking function variants for multiple architectures; (2) a high-level, library-oriented parallel language based  ...  Acknowledgments We would like to thank Perry Wang, Hong Jiang, Xinmin Tian, and Ghassan Yacoub for their help with this project. We also appreciate the support of Shekhar Borkar and Joe Schutz.  ... 
doi:10.1145/1353536.1346318 fatcat:qy4xcdxgufbtvatceo4i5ncoqy

Merge

Michael D. Linderman, Jamison D. Collins, Hong Wang, Teresa H. Meng
2008 ACM SIGOPS Operating Systems Review  
SMP system with Intel Xeon processors.  ...  The Merge framework provides (1) a predicate dispatch-based library system for managing and invoking function variants for multiple architectures; (2) a high-level, library-oriented parallel language based  ...  Acknowledgments We would like to thank Perry Wang, Hong Jiang, Xinmin Tian, and Ghassan Yacoub for their help with this project. We also appreciate the support of Shekhar Borkar and Joe Schutz.  ... 
doi:10.1145/1353535.1346318 fatcat:g64ic7s6rrgc7bjgi7lsy3ajv4

Towards an Area-Efficient Implementation of a High ILP EDGE Soft Processor [article]

Jan Gray, Aaron Smith
2018 arXiv   pre-print
In-order scalar RISC architectures have been the dominant paradigm in FPGA soft processor design for twenty years.  ...  This paper describes a new way to build fast and area-efficient out-of-order superscalar soft processors by utilizing an Explicit Data Graph Execution (EDGE) instruction set architecture.  ...  Together the EDGE architecture and its compiler finesse away much of the register renaming, CAMs, and complexity, enabling an out-of-order processor for only a few hundred LUTs more than an in-order scalar  ... 
arXiv:1803.06617v1 fatcat:jor32rhzk5cxfmraiyrpyl7zti

SWOOP: software-hardware co-design for non-speculative, execute-ahead, in-order cores

Kim-Anh Tran, Alexandra Jimborean, Trevor E. Carlson, Konstantinos Koukos, Magnus Själander, Stefanos Kaxiras
2018 Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation - PLDI 2018  
While out-of-order (OoO) architectures attempt to hide memory latency by dynamically reordering instructions, they do so through expensive, power-hungry, speculative mechanisms.  ...  Our SWOOP compiler is enhanced with lightweight architectural support, thus being able to transform applications that include highly complex control-flow and indirect memory accesses.  ...  Acknowledgements This work is supported, in part, by the Swedish Research Council UPMARC Linnaeus Centre and by the Swedish VR (grant no. 2016-05086).  ...
doi:10.1145/3192366.3192393 dblp:conf/pldi/TranJCKSK18 fatcat:jsvxxfqkzvgrnnrtl7spfd4y6a