A New Dataflow Compiler IR for Accelerating Control-Intensive Code in Spatial Hardware

Ali Mustafa Zaidi, David Greaves
2014 2014 IEEE International Parallel & Distributed Processing Symposium Workshops  
high-level synthesis that enables aggressive exposition of ILP even in the presence of complex control flow.  ...  code is often worse than what is achievable using conventional superscalar processors.  ...  Control-flow speculation via Branch Prediction: Modern superscalar processors utilize aggressive branch prediction to relax the constraints imposed by control flow on ILP: branch predictors with very high  ... 
doi:10.1109/ipdpsw.2014.18 dblp:conf/ipps/ZaidiG14 fatcat:tv5r6upngbf5bhicyzxjo6mjca

Beyond Dataflow

Borut Robič, Jurij Šilc, Theo Ungerer
2000 Journal of Computing and Information Technology  
Also some other techniques for combining control-flow and dataflow emerged, such as coarse-grain dataflow, dataflow with complex machine operations, RISC dataflow, and micro dataflow.  ...  These developments have also had certain impact on the conception of high-performance superscalar processors in the "post-RISC" era.  ...  A solution to these problems is to combine dataflow with control-flow mechanisms.  ... 
doi:10.2498/cit.2000.02.01 fatcat:3bonvcsg6jbnzj3uouzwkc5tcm

Optimizing Static Power Dissipation by Functional Units in Superscalar Processors [chapter]

Siddharth Rele, Santosh Pande, Soner Onder, Rajiv Gupta
2002 Lecture Notes in Computer Science  
We present a novel approach which combines compiler, instruction set, and microarchitecture support to turn off functional units that are idle for long periods of time for reducing static power dissipation  ...  idle regions and directives for turning them back on at exits from such regions.  ...  On and off semantics for an out-of-order superscalar processor.  ... 
doi:10.1007/3-540-45937-5_19 fatcat:4eqjuh75yjbs7nvdbejpwyuqca

Selecting Computer Architectures by Means of Control-Flow-Graph Mining [chapter]

Frank Eichinger, Klemens Böhm
2009 Lecture Notes in Computer Science  
We correlate substructures of the control-flow graphs representing the individual functions with the runtime on certain systems.  ...  In our evaluation with the SPEC CPU 2000 and 2006 benchmarks, we predict the faster system out of two with high accuracy and achieve significant speedups in execution time.  ...  Acknowledgments We thank Dietmar Hauf for much help with all aspects of this study and Wolfgang Karl and David Kramer for their guidance regarding computer architecture.  ... 
doi:10.1007/978-3-642-03915-7_27 fatcat:lsgijnyffba63a2pvnww5hpguq

CAPSULE: Hardware-Assisted Parallel Execution of Component-Based Programs

Pierre Palatin, Yves Lhuillier, Olivier Temam
2006 Microarchitecture (MICRO), Proceedings of the Annual International Symposium on  
control flow and data structures.  ...  flow to threads.  ...  Acknowledgments We would like to thank Sami Yehia, from ARM, for his support and many helpful suggestions.  ... 
doi:10.1109/micro.2006.13 dblp:conf/micro/PalatinLT06 fatcat:2e43ekij3zb2hizpllqyiz5u4u

An Architecture Framework for Transparent Instruction Set Customization in Embedded Processors

Nathan Clark, Jason Blome, Michael Chu, Scott Mahlke, Stuart Biles, Krisztian Flautner
2005 SIGARCH Computer Architecture News  
Hardware accelerators are added to the processor to execute the collapsed subgraphs.  ...  The compiler is responsible for identifying profitable subgraphs, while the hardware handles discovery, mapping, and execution of compatible subgraphs.  ...  This allows the control flow generator to eliminate spill code of transient values within the subgraph.  ... 
doi:10.1145/1080695.1069993 fatcat:2mqcqdq7fnh3hmuoskk6czjqpy

A trace cache microarchitecture and evaluation

E. Rotenberg, S. Bennett, J.E. Smith
1999 IEEE transactions on computers  
The microarchitecture provides high instruction fetch bandwidth with low latency by explicitly sequencing through the program at the higher level of traces, both in terms of (1) control flow prediction  ...  As the instruction issue width of superscalar processors increases, instruction fetch bandwidth requirements will also increase.  ...  We would also like to give special thanks to Quinn Jacobson for his valuable input and for providing access to next trace prediction simulators.  ... 
doi:10.1109/12.752652 fatcat:5nrm3ihc5rcpzlqop3dkpmkcjq

Exploiting short-lived variables in superscalar processors

L.A. Lozano, G.R. Gao
1995 Proceedings of the 28th Annual International Symposium on Microarchitecture  
Another important issue for superscalar processors is to be able to deal with dependencies between instructions that access memory.  ...  Superscalar Processors Superscalar machines have become the standard type of processor implementation for the modern general purpose microprocessors.  ... 
doi:10.1109/micro.1995.476839 dblp:conf/micro/LozanoG95 fatcat:jjqt7xr3mfb7lic5oe2eetem3q

An investigation of the performance of various instruction-issue buffer topologies

S. Jourdan, P. Sainrat, D. Litaize
1995 Proceedings of the 28th Annual International Symposium on Microarchitecture  
Another important issue for superscalar processors is to be able to deal with dependencies between instructions that access memory.  ...  Superscalar Processors Superscalar machines have become the standard type of processor implementation for the modern general purpose microprocessors.  ... 
doi:10.1109/micro.1995.476837 dblp:conf/micro/JourdanSL95 fatcat:wuhuhegebrcnjfzysttify635q

Global Scheduling Heuristics for Multicore Architecture

D. C. Kiran, S. Gurunarayanan, Janardan Prasad Misra, Abhijeet Nawal
2015 Scientific Programming  
This work discusses various compiler level global scheduling techniques for multicore processors.  ...  In conjunction with parallelization techniques, locality optimizations are performed to minimize communication overhead between the cores.  ...  Acknowledgment The authors thank Nick Johnson of University of Virginia for providing the compiler and its assembler.  ... 
doi:10.1155/2015/860891 fatcat:xb4a6nrg2jft5ksgsknzssaoui

Application-Specific Processors [chapter]

Tulika Mitra
2017 Handbook of Hardware/Software Codesign  
General-Purpose Processors (GPPs) and Application-Specific Integrated Circuits (ASICs) are the two extreme choices for computational engines.  ...  An application-specific processor architecture augments the base instruction-set architecture with customized instructions that encapsulate the frequently occurring computational patterns within an application  ...  [52] Fig. 12. 3 3 Control-flow graph and data-flow graph point.  ... 
doi:10.1007/978-94-017-7267-9_13 fatcat:caf5pmws2rb4pcf3w3eugrhknu

Argus: Low-Cost, Comprehensive Error Detection in Simple Cores

Albert Meixner, Michael E. Bauer, Daniel Sorin
2007 Microarchitecture (MICRO), Proceedings of the Annual International Symposium on  
The key to Argus is that the operation of a von Neumann core consists of four fundamental tasks-control flow, dataflow, computation, and memory access-that can be checked separately.  ...  We have developed Argus, a novel approach for providing low-cost, comprehensive error detection for simple cores.  ...  We thank Fred Bower, Derek Hower, Alvy Lebeck, Anita Lungu, and Bogdan Romanescu for feedback on this work. We thank Bogdan Romanescu and Heather Sarik for help with the experiments.  ... 
doi:10.1109/micro.2007.4408257 fatcat:fh4t4wiz45cizlkvq53dsyqeqy

Argus: Low-Cost, Comprehensive Error Detection in Simple Cores

Albert Meixner, Michael E. Bauer, Daniel J. Sorin
2008 IEEE Micro  
The key to Argus is that the operation of a von Neumann core consists of four fundamental tasks-control flow, dataflow, computation, and memory access-that can be checked separately.  ...  We have developed Argus, a novel approach for providing low-cost, comprehensive error detection for simple cores.  ...  We thank Fred Bower, Derek Hower, Alvy Lebeck, Anita Lungu, and Bogdan Romanescu for feedback on this work. We thank Bogdan Romanescu and Heather Sarik for help with the experiments.  ... 
doi:10.1109/mm.2008.3 fatcat:5wkl4zbvgbhhja3arzpdow44zi

Argus: Low-Cost, Comprehensive Error Detection in Simple Cores

Albert Meixner, Michael E. Bauer, Daniel Sorin
2007 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007)  
The key to Argus is that the operation of a von Neumann core consists of four fundamental tasks-control flow, dataflow, computation, and memory access-that can be checked separately.  ...  We have developed Argus, a novel approach for providing low-cost, comprehensive error detection for simple cores.  ...  We thank Fred Bower, Derek Hower, Alvy Lebeck, Anita Lungu, and Bogdan Romanescu for feedback on this work. We thank Bogdan Romanescu and Heather Sarik for help with the experiments.  ... 
doi:10.1109/micro.2007.18 dblp:conf/micro/MeixnerBS07 fatcat:g74a6zasczcw3jk4bmwwhrdouy

Hardware compilation of application-specific memory-access interconnect

G. Venkataramani, T. Bjerregaard, T. Chelcea, S.C. Goldstein
2006 IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems  
The latency to access memory is often not statically predictable, which creates problems for scheduling operations dependent on memory reads.  ...  Addressing these issues with static scheduling results in overly conservative circuits, and thus, most state-of-the-art HLS tools limit memory systems to those that have predictable latencies and limit  ...  The superscalar core is representative of all processor-and platform-based approaches to supporting memory accesses.  ... 
doi:10.1109/tcad.2006.870411 fatcat:ta6m6ivhabdrdkcebdwinib7gu