Slice-processors

Andreas Moshovos, Dionisios N. Pnevmatikatos, Amirali Baniasadi
2001 Proceedings of the 15th international conference on Supercomputing - ICS '01  
Finally, we study how our operation-based predictor interacts with an outcome-based one and find them mutually beneficial.  ...  We describe the Slice Processor micro-architecture that implements a generalized operation-based prefetching mechanism.  ...  This work was supported in part by an NSF CAREER award and by funds from the University of Toronto.  ...
doi:10.1145/377792.377856 dblp:conf/ics/MoshovosPB01 fatcat:etizvwwumfffhpzk5bfl2y72ji

Multithreaded Processors

T. Ungerer
2002 Computer journal  
The chip multiprocessor integrates two or more complete processors on a single chip. Every unit of a processor is duplicated and used independently of its copies on the chip.  ...  The instruction-level parallelism found in a conventional instruction stream is limited. Studies have shown the limits of processor utilization even for today's superscalar microprocessors.  ...  The SMP and the DSM multiprocessors feature a common address space, which is implemented in the SMP as a single global memory where each memory word can be accessed in uniform access time by all processors  ... 
doi:10.1093/comjnl/45.3.320 fatcat:hlkkabuhrzhkrmuyqomzfmc6zm

Multimedia processors

I. Kuroda, T. Nishitani
1998 Proceedings of the IEEE  
This paper describes recent large-scale-integration programmable processors designed for multimedia processing such as real-time compression and decompression of audio and video as well as the generation  ...  As the target of these processors is to handle audio and video in real time, the processing capability must be increased tenfold compared to that of conventional microprocessors, which were designed to  ...  These processors were designed with the target of realizing a software-implemented MPEG-1/2 encoder/decoder in real time.  ... 
doi:10.1109/5.687835 fatcat:uqmrprob5rbwzgodp6fgq4buea

Multi-Threaded Processors [chapter]

David Padua, Amol Ghoting, John A. Gunnels, Mark S. Squillante, José Meseguer, James H. Cownie, Duncan Roweth, Sarita V. Adve, Hans J. Boehm, Sally A. McKee, Robert W. Wisniewski, George Karypis (+29 others)
2011 Encyclopedia of Parallel Computing  
The chip multiprocessor integrates two or more complete processors on a single chip. Every unit of a processor is duplicated and used independently of its copies on the chip.  ...  The instruction-level parallelism found in a conventional instruction stream is limited. Studies have shown the limits of processor utilization even for today's superscalar microprocessors.  ...  The SMP and the DSM multiprocessors feature a common address space, which is implemented in the SMP as a single global memory where each memory word can be accessed in uniform access time by all processors  ... 
doi:10.1007/978-0-387-09766-4_423 fatcat:heb3n2cfwnbi5nvxv5kvxd2xgm

Slipstream processors

Karthik Sundaramoorthy, Zach Purser, Eric Rotenberg
2000 SIGPLAN notices  
The shortened program is run concurrently with the full program on a chip multiprocessor or simultaneous multithreaded processor, with two key advantages: 1) Improved single-program performance.  ...  Note that two kinds of speculation occur in the A-stream. Conventional speculation occurs when branches are predicted and the branch-related computation has not been removed from the A-stream.  ...  We are grateful to Jim Smith for suggesting the name "slipstream" and pointing out the useful car racing analogy.  ... 
doi:10.1145/356989.357013 fatcat:vs4txm2jsbhfzfegv3drxfo4c4

Evaluating Various Branch-Prediction Schemes for Biomedical-Implant Processors

Christos Strydis, Georgi N. Gaydadjiev
2009 2009 20th IEEE International Conference on Application-specific Systems, Architectures and Processors  
Our profiling study has revealed that, under strict or relaxed area constraints and regardless of cache size, the ALWAYS TAKEN and ALWAYS NOT-TAKEN static prediction schemes are, in almost all cases, the  ...  Results are used to drive the (micro)architectural design of a novel microprocessor targeting microelectronic implants.  ...  We, thus, offer insights on the design and implementation of the branch-prediction subsystem of our targeted processor.  ... 
doi:10.1109/asap.2009.37 dblp:conf/asap/StrydisG09 fatcat:xfqqctjrlvcj7gkyxl3aycmyx4

A Systematic Design Space Exploration Approach to Customising Multi-Processor Architectures: Exemplified Using Graphics Processors [chapter]

Ben Cope, Peter Y. K. Cheung, Wayne Luk, Lee Howes
2011 Lecture Notes in Computer Science  
A systematic approach to customising Homogeneous Multi-Processor (HoMP) architectures is described. The approach involves a novel design space exploration tool and a parameterisable system model.  ...  We also analyse on-chip and off-chip memory access for systems with one or more processing elements (PEs), and study the impact of the number of threads per PE on the amount of off-chip memory access and  ...  To explore the design space of the graphics processor for the case study of this work a high-level system model is shown in Figure 4.  ...
doi:10.1007/978-3-642-24568-8_4 fatcat:7r43f3e5hjhdje5g7njmb2zczq

Scale-out processors

Pejman Lotfi-Kamran, Boris Grot, Michael Ferdman, Stavros Volos, Onur Kocberber, Javier Picorel, Almutaz Adileh, Djordje Jevdjic, Sachin Idgunji, Emre Ozer, Babak Falsafi
2012 2012 39th Annual International Symposium on Computer Architecture (ISCA)  
In this work, we introduce a methodology for designing scalable and efficient scale-out server processors.  ...  Moreover, as each pod is a stand-alone server, scale-out processors avoid the expense of global (i.e., interpod) interconnect and coherence.  ...  This work was partially supported by EuroCloud, Project No 247779 of the European Commission 7th RTD Framework Programme - Information and Communication Technologies: Computing Systems.  ...
doi:10.1109/isca.2012.6237043 dblp:conf/isca/Lotfi-KamranGFVKPAJIOF12 fatcat:tr4ecb3mxrbfbj7dvfphtqiqhq

Scale-out processors

Pejman Lotfi-Kamran, Emre Ozer, Babak Falsafi, Boris Grot, Michael Ferdman, Stavros Volos, Onur Kocberber, Javier Picorel, Almutaz Adileh, Djordje Jevdjic, Sachin Idgunji
2012 SIGARCH Computer Architecture News  
In this work, we introduce a methodology for designing scalable and efficient scale-out server processors.  ...  Moreover, as each pod is a stand-alone server, scale-out processors avoid the expense of global (i.e., interpod) interconnect and coherence.  ...  This work was partially supported by EuroCloud, Project No 247779 of the European Commission 7th RTD Framework Programme - Information and Communication Technologies: Computing Systems.  ...
doi:10.1145/2366231.2337217 fatcat:vm5i3gszjbb4zdurf2wdfr7sjm

Data caches for superscalar processors

Toni Juan, Juan J. Navarro, Olivier Temam
1997 Proceedings of the 11th international conference on Supercomputing - ICS '97  
Because of the high cost of true multi-ported caches, alternative cache designs must be evaluated.  ...  As the number of instructions executed in parallel increases, superscalar processors will require higher bandwidth from data caches.  ...  For n accesses, the cache must be replicated n times with no benefit to storage space.  ... 
doi:10.1145/263580.263595 dblp:conf/ics/JuanNT97 fatcat:6322isyfkreedostpdo5rjzsia

Design Principles for Synthesizable Processor Cores [chapter]

Pascal Schleuniger, Sally A. McKee, Sven Karlsson
2012 Lecture Notes in Computer Science  
To evaluate their effects, we develop Tinuso, a processor architecture optimized for FPGA implementation.  ...  We demonstrate through the use of micro-benchmarks that our principles guide the design of a processor core that improves performance by an average of 38% over a similar Xilinx MicroBlaze configuration  ...  The authors acknowledge the HiPEAC 2 European Network of Excellence.  ... 
doi:10.1007/978-3-642-28293-5_10 fatcat:qhrpdxwnubdvnk4zayrg7yr2v4

Cross-profiling for Java processors

Walter Binder, Martin Schoeberl, Philippe Moret, Alex Villazón
2009 Software, Practice & Experience  
As a case study, we explore the performance impact of various processor design choices and optimizations, such as different cache sizes or pipeline organizations, and come up with an improved processor  ...  As case study, we employ cross-profiling in order to quantitatively explore the performance impact of various processor design options and optimizations.  ...  As a case study, we have presented an approach to computer architecture evaluation for embedded Java processors using cross-profiling.  ... 
doi:10.1002/spe.940 fatcat:gaaerph7pfdlvpusj7fapzsfnq

Scaling Soft Processor Systems

Martin Labrecque, Peter Yiannacouras, J. Gregory Steffan
2008 2008 16th International Symposium on Field-Programmable Custom Computing Machines  
In particular we design and evaluate real FPGA-based processor, multithreaded processor, and multiprocessor systems on EEMBC benchmarks, investigating different approaches to scaling caches, processors,  ...  In contrast with previous studies of systems with on-chip memory [3, 4], with off-chip memory we find that single-threaded processors are generally more area-efficient than multithreaded processors.  ...  We first observe that the single-threaded and different multithreaded processor designs with various cache sizes allow us to span a broad range of the performance/area space, giving a system designer interested  ...
doi:10.1109/fccm.2008.8 dblp:conf/fccm/LabrecqueYS08 fatcat:nyll63y5grahtkyp6udgmx3y7q

Chip multi-processor generator

Alex Solomatnikov, Amin Firoozshahian, Wajahat Qadeer, Ofer Shacham, Kyle Kelley, Zain Asgar, Megan Wachs, Rehan Hameed, Mark Horowitz
2007 Proceedings - Design Automation Conference  
Since the power and area of the chip are limited, a compromise among the expected use-cases is typically implemented.  ...  Furthermore, every time a chip is built, we inherently evaluate different design decisions, either implicitly using micro-architectural and domain knowledge, or explicitly through custom evaluation tools  ...  Upon compile time of the code, each file is first pre-processed and all the embedded pre-processor directives are evaluated to create a new text file.  ... 
doi:10.1145/1278480.1278544 dblp:conf/dac/SolomatnikovFQSKAWHH07 fatcat:r5cfnoxqarg5lghtnmqidi7wxy

Prototyping Framework for Reconfigurable Processors [chapter]

Sergej Sawitzki, Steffen Köhler, Rainer G. Spallek
2001 Lecture Notes in Computer Science  
A couple of concept and prototyping studies have introduced the reconfigurability within general purpose microprocessor world.  ...  The work differs from the previous approaches in the fact that a systematical way (concerning both hardware and software sides) to design, test and debug a class of reconfigurable computing cores instead  ...  The prototyping study introduced in this paper attempts to find a systematic approach to design and test of small reconfigurable processor cores. The rest of this paper is organized as follows.  ... 
doi:10.1007/3-540-44687-7_2 fatcat:hwct5zh2cjbmnlq6h6fvib4qwu