A New Dataflow Compiler IR for Accelerating Control-Intensive Code in Spatial Hardware
2014
2014 IEEE International Parallel & Distributed Processing Symposium Workshops
high-level synthesis that enables aggressive exposition of ILP even in the presence of complex control flow. ...
code is often worse than what is achievable using conventional superscalar processors. ...
Control-flow speculation via Branch Prediction: Modern superscalar processors utilize aggressive branch prediction to relax the constraints imposed by control flow on ILP: branch predictors with very high ...
doi:10.1109/ipdpsw.2014.18
dblp:conf/ipps/ZaidiG14
fatcat:tv5r6upngbf5bhicyzxjo6mjca
Beyond Dataflow
2000
Journal of Computing and Information Technology
Also some other techniques for combining control-flow and dataflow emerged, such as coarse-grain dataflow, dataflow with complex machine operations, RISC dataflow, and micro dataflow. ...
These developments have also had a certain impact on the conception of high-performance superscalar processors in the "post-RISC" era. ...
A solution to these problems is to combine dataflow with control-flow mechanisms. ...
doi:10.2498/cit.2000.02.01
fatcat:3bonvcsg6jbnzj3uouzwkc5tcm
Optimizing Static Power Dissipation by Functional Units in Superscalar Processors
[chapter]
2002
Lecture Notes in Computer Science
We present a novel approach which combines compiler, instruction set, and microarchitecture support to turn off functional units that are idle for long periods of time for reducing static power dissipation ...
idle regions and directives for turning them back on at exits from such regions. ...
On and off semantics for an out-of-order superscalar processor. ...
doi:10.1007/3-540-45937-5_19
fatcat:4eqjuh75yjbs7nvdbejpwyuqca
Selecting Computer Architectures by Means of Control-Flow-Graph Mining
[chapter]
2009
Lecture Notes in Computer Science
We correlate substructures of the control-flow graphs representing the individual functions with the runtime on certain systems. ...
In our evaluation with the SPEC CPU 2000 and 2006 benchmarks, we predict the faster system out of two with high accuracy and achieve significant speedups in execution time. ...
Acknowledgments: We thank Dietmar Hauf for much help with all aspects of this study and Wolfgang Karl and David Kramer for their guidance regarding computer architecture. ...
doi:10.1007/978-3-642-03915-7_27
fatcat:lsgijnyffba63a2pvnww5hpguq
CAPSULE: Hardware-Assisted Parallel Execution of Component-Based Programs
2006
Microarchitecture (MICRO), Proceedings of the Annual International Symposium on
control flow and data structures. ...
flow to threads. ...
Acknowledgments: We would like to thank Sami Yehia, from ARM, for his support and many helpful suggestions. ...
doi:10.1109/micro.2006.13
dblp:conf/micro/PalatinLT06
fatcat:2e43ekij3zb2hizpllqyiz5u4u
An Architecture Framework for Transparent Instruction Set Customization in Embedded Processors
2005
SIGARCH Computer Architecture News
Hardware accelerators are added to the processor to execute the collapsed subgraphs. ...
The compiler is responsible for identifying profitable subgraphs, while the hardware handles discovery, mapping, and execution of compatible subgraphs. ...
This allows the control flow generator to eliminate spill code of transient values within the subgraph. ...
doi:10.1145/1080695.1069993
fatcat:2mqcqdq7fnh3hmuoskk6czjqpy
A trace cache microarchitecture and evaluation
1999
IEEE transactions on computers
The microarchitecture provides high instruction fetch bandwidth with low latency by explicitly sequencing through the program at the higher level of traces, both in terms of (1) control flow prediction ...
As the instruction issue width of superscalar processors increases, instruction fetch bandwidth requirements will also increase. ...
We would also like to give special thanks to Quinn Jacobson for his valuable input and for providing access to next trace prediction simulators. ...
doi:10.1109/12.752652
fatcat:5nrm3ihc5rcpzlqop3dkpmkcjq
Exploiting short-lived variables in superscalar processors
1995
Proceedings of the 28th Annual International Symposium on Microarchitecture
Another important issue for superscalar processors is to be able to deal with dependencies between instructions that access memory. ...
Superscalar Processors: Superscalar machines have become the standard type of processor implementation for the modern general-purpose microprocessors. ...
doi:10.1109/micro.1995.476839
dblp:conf/micro/LozanoG95
fatcat:jjqt7xr3mfb7lic5oe2eetem3q
An investigation of the performance of various instruction-issue buffer topologies
1995
Proceedings of the 28th Annual International Symposium on Microarchitecture
Another important issue for superscalar processors is to be able to deal with dependencies between instructions that access memory. ...
Superscalar Processors: Superscalar machines have become the standard type of processor implementation for the modern general-purpose microprocessors. ...
doi:10.1109/micro.1995.476837
dblp:conf/micro/JourdanSL95
fatcat:wuhuhegebrcnjfzysttify635q
Global Scheduling Heuristics for Multicore Architecture
2015
Scientific Programming
This work discusses various compiler level global scheduling techniques for multicore processors. ...
In conjunction with parallelization techniques, locality optimizations are performed to minimize communication overhead between the cores. ...
Acknowledgment: The authors thank Nick Johnson of University of Virginia for providing the compiler and its assembler. ...
doi:10.1155/2015/860891
fatcat:xb4a6nrg2jft5ksgsknzssaoui
Application-Specific Processors
[chapter]
2017
Handbook of Hardware/Software Codesign
General-Purpose Processors (GPPs) and Application-Specific Integrated Circuits (ASICs) are the two extreme choices for computational engines. ...
An application-specific processor architecture augments the base instruction-set architecture with customized instructions that encapsulate the frequently occurring computational patterns within an application ...
Figure: control-flow graph and data-flow graph. ...
doi:10.1007/978-94-017-7267-9_13
fatcat:caf5pmws2rb4pcf3w3eugrhknu
Argus: Low-Cost, Comprehensive Error Detection in Simple Cores
2007
Microarchitecture (MICRO), Proceedings of the Annual International Symposium on
The key to Argus is that the operation of a von Neumann core consists of four fundamental tasks (control flow, dataflow, computation, and memory access) that can be checked separately. ...
We have developed Argus, a novel approach for providing low-cost, comprehensive error detection for simple cores. ...
We thank Fred Bower, Derek Hower, Alvy Lebeck, Anita Lungu, and Bogdan Romanescu for feedback on this work. We thank Bogdan Romanescu and Heather Sarik for help with the experiments. ...
doi:10.1109/micro.2007.4408257
fatcat:fh4t4wiz45cizlkvq53dsyqeqy
Argus: Low-Cost, Comprehensive Error Detection in Simple Cores
2008
IEEE Micro
The key to Argus is that the operation of a von Neumann core consists of four fundamental tasks (control flow, dataflow, computation, and memory access) that can be checked separately. ...
We have developed Argus, a novel approach for providing low-cost, comprehensive error detection for simple cores. ...
We thank Fred Bower, Derek Hower, Alvy Lebeck, Anita Lungu, and Bogdan Romanescu for feedback on this work. We thank Bogdan Romanescu and Heather Sarik for help with the experiments. ...
doi:10.1109/mm.2008.3
fatcat:5wkl4zbvgbhhja3arzpdow44zi
Argus: Low-Cost, Comprehensive Error Detection in Simple Cores
2007
40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007)
The key to Argus is that the operation of a von Neumann core consists of four fundamental tasks (control flow, dataflow, computation, and memory access) that can be checked separately. ...
We have developed Argus, a novel approach for providing low-cost, comprehensive error detection for simple cores. ...
We thank Fred Bower, Derek Hower, Alvy Lebeck, Anita Lungu, and Bogdan Romanescu for feedback on this work. We thank Bogdan Romanescu and Heather Sarik for help with the experiments. ...
doi:10.1109/micro.2007.18
dblp:conf/micro/MeixnerBS07
fatcat:g74a6zasczcw3jk4bmwwhrdouy
Hardware compilation of application-specific memory-access interconnect
2006
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
The latency to access memory is often not statically predictable, which creates problems for scheduling operations dependent on memory reads. ...
Addressing these issues with static scheduling results in overly conservative circuits, and thus, most state-of-the-art HLS tools limit memory systems to those that have predictable latencies and limit ...
The superscalar core is representative of all processor- and platform-based approaches to supporting memory accesses. ...
doi:10.1109/tcad.2006.870411
fatcat:ta6m6ivhabdrdkcebdwinib7gu