Slice-processors
2001
Proceedings of the 15th international conference on Supercomputing - ICS '01
Finally, we study how our operation-based predictor interacts with an outcome-based one and find them mutually beneficial. ...
We describe the Slice Processor micro-architecture that implements a generalized operation-based prefetching mechanism. ...
This work was supported in part by an NSF CAREER award and by funds from the University of Toronto. ...
doi:10.1145/377792.377856
dblp:conf/ics/MoshovosPB01
fatcat:etizvwwumfffhpzk5bfl2y72ji
Multithreaded Processors
2002
Computer journal
The chip multiprocessor integrates two or more complete processors on a single chip. Every unit of a processor is duplicated and used independently of its copies on the chip. ...
The instruction-level parallelism found in a conventional instruction stream is limited. Studies have shown the limits of processor utilization even for today's superscalar microprocessors. ...
The SMP and the DSM multiprocessors feature a common address space, which is implemented in the SMP as a single global memory where each memory word can be accessed in uniform access time by all processors ...
doi:10.1093/comjnl/45.3.320
fatcat:hlkkabuhrzhkrmuyqomzfmc6zm
Multimedia processors
1998
Proceedings of the IEEE
This paper describes recent large-scale-integration programmable processors designed for multimedia processing such as real-time compression and decompression of audio and video as well as the generation ...
As the target of these processors is to handle audio and video in real time, the processing capability must be increased tenfold compared to that of conventional microprocessors, which were designed to ...
These processors were designed with the target of realizing a software-implemented MPEG-1/2 encoder/decoder in real time. ...
doi:10.1109/5.687835
fatcat:uqmrprob5rbwzgodp6fgq4buea
Multi-Threaded Processors
[chapter]
2011
Encyclopedia of Parallel Computing
The chip multiprocessor integrates two or more complete processors on a single chip. Every unit of a processor is duplicated and used independently of its copies on the chip. ...
The instruction-level parallelism found in a conventional instruction stream is limited. Studies have shown the limits of processor utilization even for today's superscalar microprocessors. ...
The SMP and the DSM multiprocessors feature a common address space, which is implemented in the SMP as a single global memory where each memory word can be accessed in uniform access time by all processors ...
doi:10.1007/978-0-387-09766-4_423
fatcat:heb3n2cfwnbi5nvxv5kvxd2xgm
Slipstream processors
2000
SIGPLAN notices
The shortened program is run concurrently with the full program on a chip multiprocessor or simultaneous multithreaded processor, with two key advantages: 1) Improved single-program performance. ...
Note that two kinds of speculation occur in the A-stream. Conventional speculation occurs when branches are predicted and the branch-related computation has not been removed from the A-stream. ...
We are grateful to Jim Smith for suggesting the name "slipstream" and pointing out the useful car racing analogy. ...
doi:10.1145/356989.357013
fatcat:vs4txm2jsbhfzfegv3drxfo4c4
Evaluating Various Branch-Prediction Schemes for Biomedical-Implant Processors
2009
2009 20th IEEE International Conference on Application-specific Systems, Architectures and Processors
Our profiling study has revealed that, under strict or relaxed area constraints and regardless of cache size, the ALWAYS TAKEN and ALWAYS NOT-TAKEN static prediction schemes are, in almost all cases, the ...
Results are used to drive the (micro)architectural design of a novel microprocessor targeting microelectronic implants. ...
We, thus, offer insights on the design and implementation of the branch-prediction subsystem of our targeted processor. ...
doi:10.1109/asap.2009.37
dblp:conf/asap/StrydisG09
fatcat:xfqqctjrlvcj7gkyxl3aycmyx4
A Systematic Design Space Exploration Approach to Customising Multi-Processor Architectures: Exemplified Using Graphics Processors
[chapter]
2011
Lecture Notes in Computer Science
A systematic approach to customising Homogeneous Multi-Processor (HoMP) architectures is described. The approach involves a novel design space exploration tool and a parameterisable system model. ...
We also analyse on-chip and off-chip memory access for systems with one or more processing elements (PEs), and study the impact of the number of threads per PE on the amount of off-chip memory access and ...
To explore the design space of the graphics processor for the case study of this work a high-level system model is shown in Figure 4. ...
doi:10.1007/978-3-642-24568-8_4
fatcat:7r43f3e5hjhdje5g7njmb2zczq
Scale-out processors
2012
2012 39th Annual International Symposium on Computer Architecture (ISCA)
In this work, we introduce a methodology for designing scalable and efficient scale-out server processors. ...
Moreover, as each pod is a stand-alone server, scale-out processors avoid the expense of global (i.e., interpod) interconnect and coherence. ...
This work was partially supported by EuroCloud, Project No 247779 of the European Commission 7th RTD Framework Programme - Information and Communication Technologies: Computing Systems. ...
doi:10.1109/isca.2012.6237043
dblp:conf/isca/Lotfi-KamranGFVKPAJIOF12
fatcat:tr4ecb3mxrbfbj7dvfphtqiqhq
Scale-out processors
2012
SIGARCH Computer Architecture News
In this work, we introduce a methodology for designing scalable and efficient scale-out server processors. ...
Moreover, as each pod is a stand-alone server, scale-out processors avoid the expense of global (i.e., interpod) interconnect and coherence. ...
This work was partially supported by EuroCloud, Project No 247779 of the European Commission 7th RTD Framework Programme - Information and Communication Technologies: Computing Systems. ...
doi:10.1145/2366231.2337217
fatcat:vm5i3gszjbb4zdurf2wdfr7sjm
Data caches for superscalar processors
1997
Proceedings of the 11th international conference on Supercomputing - ICS '97
Because of the high cost of true multi-ported caches, alternative cache designs must be evaluated. ...
As the number of instructions executed in parallel increases, superscalar processors will require higher bandwidth from data caches. ...
For n accesses, the cache must be replicated n times with no benefit to storage space. ...
doi:10.1145/263580.263595
dblp:conf/ics/JuanNT97
fatcat:6322isyfkreedostpdo5rjzsia
Design Principles for Synthesizable Processor Cores
[chapter]
2012
Lecture Notes in Computer Science
To evaluate their effects, we develop Tinuso, a processor architecture optimized for FPGA implementation. ...
We demonstrate through the use of micro-benchmarks that our principles guide the design of a processor core that improves performance by an average of 38% over a similar Xilinx MicroBlaze configuration ...
The authors acknowledge the HiPEAC 2 European Network of Excellence. ...
doi:10.1007/978-3-642-28293-5_10
fatcat:qhrpdxwnubdvnk4zayrg7yr2v4
Cross-profiling for Java processors
2009
Software, Practice & Experience
As a case study, we explore the performance impact of various processor design choices and optimizations, such as different cache sizes or pipeline organizations, and come up with an improved processor ...
As a case study, we employ cross-profiling in order to quantitatively explore the performance impact of various processor design options and optimizations. ...
As a case study, we have presented an approach to computer architecture evaluation for embedded Java processors using cross-profiling. ...
doi:10.1002/spe.940
fatcat:gaaerph7pfdlvpusj7fapzsfnq
Scaling Soft Processor Systems
2008
2008 16th International Symposium on Field-Programmable Custom Computing Machines
In particular we design and evaluate real FPGA-based processor, multithreaded processor, and multiprocessor systems on EEMBC benchmarks, investigating different approaches to scaling caches, processors, ...
In contrast with previous studies of systems with on-chip memory [3, 4], with off-chip memory we find that single-threaded processors are generally more area-efficient than multithreaded processors. ...
We first observe that the single-threaded and different multithreaded processor designs with various cache sizes allow us to span a broad range of the performance/area space, giving a system designer interested ...
doi:10.1109/fccm.2008.8
dblp:conf/fccm/LabrecqueYS08
fatcat:nyll63y5grahtkyp6udgmx3y7q
Chip multi-processor generator
2007
Proceedings - Design Automation Conference
Since the power and area of the chip are limited, a compromise among the expected use-cases is typically implemented. ...
Furthermore, every time a chip is built, we inherently evaluate different design decisions, either implicitly using micro-architectural and domain knowledge, or explicitly through custom evaluation tools ...
Upon compile time of the code, each file is first pre-processed and all the embedded pre-processor directives are evaluated to create a new text file. ...
doi:10.1145/1278480.1278544
dblp:conf/dac/SolomatnikovFQSKAWHH07
fatcat:r5cfnoxqarg5lghtnmqidi7wxy
Prototyping Framework for Reconfigurable Processors
[chapter]
2001
Lecture Notes in Computer Science
A couple of concept and prototyping studies have introduced the reconfigurability within general purpose microprocessor world. ...
The work differs from the previous approaches in the fact that a systematical way (concerning both hardware and software sides) to design, test and debug a class of reconfigurable computing cores instead ...
The prototyping study introduced in this paper attempts to find a systematic approach to design and test of small reconfigurable processor cores. The rest of this paper is organized as follows. ...
doi:10.1007/3-540-44687-7_2
fatcat:hwct5zh2cjbmnlq6h6fvib4qwu