A study of control independence in superscalar processors

E. Rotenberg, Q. Jacobson, J. Smith
1999 Proceedings Fifth International Symposium on High-Performance Computer Architecture  
Control independence has been put forward as a significant new source of instruction-level parallelism for future generation processors.  ...  Important aspects of control independence are identified and singled out for study, and a series of idealized machine models are used to isolate and evaluate these aspects.  ...  The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the  ... 
doi:10.1109/hpca.1999.744346 dblp:conf/hpca/RotenbergJS99 fatcat:ns45qyk5wzglnayipgcacseine

A single-chip multiprocessor

B.A. Nayfeh, K. Olukotun
1997 Computer  
A third form of very coarse parallelism, process-level parallelism, involves completely independent applications running in independent processes controlled by the operating system.  ...  To find independent instructions within a sequential sequence of instructions, or thread of control, today's processors increasingly make use of sophisticated architectural features.  ... 
doi:10.1109/2.612253 fatcat:l645n6krxnaphalnk5w6pogwye

Hybrid multi-core architecture for boosting single-threaded performance

Jun Yan, Wei Zhang
2007 SIGARCH Computer Architecture News  
In this paper, we propose a compiler-driven heterogeneous multicore architecture, consisting of tightly-integrated VLIW (Very Long Instruction Word) and superscalar processors on a single chip, to automatically  ...  While multithreaded applications can naturally leverage the enhanced throughput of multi-core processors, a large number of important applications are single-threaded, which cannot automatically harness  ...  Acknowledgement This work was funded in part by NSF grant 0613244. References  ... 
doi:10.1145/1241601.1241603 fatcat:vjzotxsbo5dtvc6oe6wcifxcie

Multithreading decoupled architectures for complexity-effective general purpose computing

Michael Sung, Ronny Krashinsky, Krste Asanović
2001 SIGARCH Computer Architecture News  
Decoupled architectures have not traditionally been used in the context of general purpose computing because of their inability to tolerate control-intensive code that exists across a wide range of applications  ...  It is argued that such a decoupled architecture is more complexity-effective and scalable than comparable superscalar processors, which incorporate enormous amounts of complexity for modest performance  ...  In his preliminary study, Smith introduces the concept of a decoupled access/execute (DAE) machine.  ... 
doi:10.1145/563647.563658 fatcat:fjmdpove5ravhclvctbfurz6im

Performance estimation of multistreamed, superscalar processors

W. Yamamoto, M.J. Serrano, A.R. Talcott, R.C. Wood, M. Nemirovsky
1994 Proceedings of the Twenty-Seventh Hawaii International Conference on System Sciences HICSS-94  
In this paper, we present an analytical modeling technique to evaluate the effect of dynamically interleaving additional instruction streams within superscalar architectures.  ...  Using this technique, estimates of the instructions executed per cycle (IPC) for a processor architecture are quickly calculated given simple descriptions of the workload and hardware characteristics.  ...  In this paper, we study multistreamed, superscalar processors.  ... 
doi:10.1109/hicss.1994.323172 fatcat:fai3ploxjja3fifqufwzcfzcla

Application of software data dependency detection algorithm in superscalar computer architecture

Elena Zaharieva-Stoyanova, Lorentz Jäntschi
2003 Proceedings of the 4th international conference on Computer systems and technologies e-Learning - CompSysTech '03  
This paper treats the problem of detection of data hazards in superscalar execution. The algorithm of independent instruction detection is represented.  ...  The algorithm is implemented in a software simulator, which represents the way the Intel Pentium Processor works. It can be used in software module, which simulates out-of-order execution logic.  ...  Although the pipeline usage is a feature of RISC processors, this technique is used also in processors with mixed architecture - a mix of RISC and CISC.  ... 
doi:10.1145/973620.973638 fatcat:3godi24zdngobd75b2l435d4t4

Parallelism exploitation in superscalar multiprocessing

N.-P. Lu, C.-P. Chung
1998 IEE Proceedings - Computers and digital Techniques  
It was observed that the instruction-level and task-level parallelism in programs can be exploited well by a moderate degree of superscalar processing and a high degree of multiprocessing.  ...  For example, the speedup of a 32-way multiprocessor with eight-issue processors can be over 200 relative to a single-issue uniprocessor.  ...  Fig. 5 illustrates the processor microarchitecture that can be modelled by SMINT. The instruction control unit is the central controller of the superscalar processor.  ... 
doi:10.1049/ip-cdt:19981955 fatcat:ih24325o5jcijevm5hwntouove

A multiprocessor architecture combining fine-grained and coarse-grained parallelism strategies

David J Lilja
1994 Parallel Computing  
These simulations indicate that the best system performance is obtained by using a mix of fine-grained and coarse-grained parallelism in which any number of processors can be used, but each processor should  ...  minimize the execution time of a single application program.  ...  A preliminary version of this work was presented at the Twenty-Fourth Annual Hawaii International Conference on System Sciences [19].  ... 
doi:10.1016/0167-8191(94)90003-5 fatcat:5owuiokm5bha5jdcjvqg7shnja

Achieving Superscalar Performance without Superscalar Overheads - A Dataflow Compiler IR for Custom Computing

Ali Mustafa Zaidi, David J. Greaves, Marc Herbstritt
2013 Imperial College Computing Student Workshop  
a new compiler IR for high-level synthesis that enables aggressive exposition of ILP even in the presence of complex control flow.  ...  Our custom hardware is able to approach the sequential cycle-counts of an Intel Nehalem Core i7 superscalar processor, while consuming on average only 0.25× the energy of an in-order Altera Nios IIf processor  ...  The Superscalar performance advantage: A Case Study.  ... 
doi:10.4230/oasics.iccsw.2013.136 dblp:conf/iccsw/ZaidiG13 fatcat:5um2rvefbzf6do4hkrr6fgkbka

Dataflow: A Complement to Superscalar

M. Budiu, P.V. Artigas, S.C. Goldstein
2005 IEEE International Symposium on Performance Analysis of Systems and Software, 2005. ISPASS 2005.  
In this paper we analyze the performance of a class of static dataflow machines on integer media and control-intensive programs and we explain why a dataflow machine, even with unlimited resources, does  ...  There has been a resurgence of interest in dataflow architectures, because of their potential for exploiting parallelism with low overhead.  ...  [10] proposed a superscalar microarchitecture, Skipper, that also exploits control-independence.  ... 
doi:10.1109/ispass.2005.1430572 dblp:conf/ispass/BudiuAG05 fatcat:ykp2sh7ffbbgxou4kdt7worsgy

Exploiting Statically Identified ILP for Network Processor Applications

Byeong Kil Lee
2010 International Journal of Computer and Electrical Engineering  
Network processors with various parallel architectures are appearing in the market, however, a thorough investigation of the implications of static versus dynamic scheduling of this class of emerging workloads  ...  With the large parallelism and the loop nature of network applications, our experimental analysis supports static scheduling as an appropriate strategy for network processor applications.  ...  EXPERIMENTAL FRAMEWORK In order to study the effectiveness of static and dynamic scheduling for NP applications, we perform experiments on an out-of-order superscalar processor model and a VLIW architecture  ... 
doi:10.7763/ijcee.2010.v2.236 fatcat:ksy2nvipxnbenbrs4xh63a65mu

Accurately approximating superscalar processor performance from traces

Kiyeon Lee, Shayne Evans, Sangyeun Cho
2009 2009 IEEE International Symposium on Performance Analysis of Systems and Software  
In this paper, we discuss and evaluate three strategies to quantify the impact of a long latency memory access in a superscalar processor when traces have only L1 cache misses.  ...  The dynamic nature of superscalar processors combined with the static nature of traces can lead to large inaccuracies in the results, especially when traces contain only a subset of executed instructions  ...  Acknowledgment This work was supported in part by NSF grant CCF-0702236 and an A. Richard Newton Graduate Scholarship from the 45th Design Automation Conf. (DAC).  ... 
doi:10.1109/ispass.2009.4919655 dblp:conf/ispass/LeeEC09 fatcat:hhq4ihnt7jd63dscz3b7xwh7ky

Disjoint out-of-order execution processor

Mageda Sharafeddine, Komal Jothi, Haitham Akkary
2012 ACM Transactions on Architecture and Code Optimization (TACO)  
We evaluate the potential performance of DOE processor architecture using a simple heuristic to fork control independent threads in hardware at the target addresses of future procedure return instructions  ...  High-performance superscalar architectures used to exploit instruction level parallelism in single-thread applications have become too complex and power hungry for the multicore processors era.  ...  load and store queue entries in superscalar processors.  ... 
doi:10.1145/2355585.2355592 fatcat:3mrp3fyihfgtnitoli35mhmtdy

Architectural differences of efficient sequential and parallel computers

Martti J. Forsell
2002 Journal of systems architecture  
For that purpose we analytically evaluate the performance of eight general purpose processor architectures representing widely both commercial and scientific processor designs in both single processor  ...  the processor architect's point of view.  ...  A superscalar processor is a processor in which superscalar execution is dynamic. The processor decides which instructions are executed in parallel during the execution of a program [20, 25, 43].  ... 
doi:10.1016/s1383-7621(02)00064-4 fatcat:graadduelvdupc72gz6bpwsyxm

Dynamic resizing of superscalar datapath components for energy efficiency

D. Ponomarev, G. Kucuk, K. Ghose
2006 IEEE transactions on computers  
In this study, we aim to adapt our already-proven method for single-threaded superscalar processors to simultaneous multi-threaded (SMT) processors for energy savings.  ...  Since the energy consumption of the turned-off datapath resources is quite low, it becomes possible to achieve large energy savings within a processor.  ...  of about 5%. The superscalar study in [2], [3], [4] targets Pentium-III-like processors that combine physical register files and reorder buffers in a single structure.  ... 
doi:10.1109/tc.2006.23 fatcat:uo5i5sjwxjeydmz3pqot2x3fya