Efficient Instruction Schedulers for SMT Processors

J.J. Sharkey, D.V. Ponomarev
The Twelfth International Symposium on High-Performance Computer Architecture, 2006.  
We propose dynamic scheduler designs to improve the scheduler scalability and reduce its complexity in the SMT processors.  ...  Our first design is an adaptation of the recently proposed instruction packing to SMT.  ...  We would also like to thank Oguz Ergin for his assistance with the use of Cadence design tools.  ... 
doi:10.1109/hpca.2006.1598137 dblp:conf/hpca/SharkeyP06 fatcat:pehckrui2bbabef7c3ms6diygi

Exploring chip-multiprocessors in deeply-embedded real-time computing

Xuan Qi
2008 ACM SIGBED Review  
I discuss the issues to be addressed such as effective resource modeling, efficient scheduling algorithms, and energy efficient design.  ...  As an energy efficient high-performance architecture, chip multiprocessor (CMP) can be deployed in deeply-embedded real-time computing.  ...  The central idea for CMP/SMT processors to improve system performance and power efficiency is to exploit both instruction- and thread-level parallelisms in the applications.  ... 
doi:10.1145/1366283.1366296 fatcat:p6f3gmhxrvfsncsd43ov6vk7ti

The energy efficiency of CMP vs. SMT for multimedia workloads

Ruchira Sasanka, Sarita V. Adve, Yen-Kuang Chen, Eric Debes
2004 Proceedings of the 18th annual international conference on Supercomputing - ICS '04  
This paper compares the energy efficiency of chip multiprocessing (CMP) and simultaneous multithreading (SMT) on modern out-of-order processors for the increasingly important multimedia applications.  ...  We find that for the design space explored, for each workload, at each performance point, CMP is more energy efficient than SMT.  ...  All threads of an SMT share the L1 data and instruction caches. For CMP, each processor core has its own L1 data and L1 instruction cache, and all cores share an L2 cache through a common bus.  ... 
doi:10.1145/1006209.1006238 dblp:conf/ics/SasankaACD04 fatcat:mojecm2mdjemlot6xxjr52pfme

Operating system exploitation of the POWER5 system

P. Mackerras, T. S. Mathews, R. C. Swanberg
2005 IBM Journal of Research and Development  
In particular, the overheads for synchronizing translation-lookaside buffer (TLB) invalidations between processors, and for ensuring that the instruction cache is kept coherent by software, have been removed  ...  for CPU usage when in the SMT mode.  ...  Acknowledgments David Engebretsen and his team designed and implemented much of the architecture-specific Linux kernel code for supporting shared processors and SMT, and in particular the code to cede  ... 
doi:10.1147/rd.494.0533 fatcat:okawyfaynndb3ip3cxvkfmw6ki

Efficient Transient-Fault Tolerance for Multithreaded Processors Using Dual-Thread Execution

Yi Ma, Huiyang Zhou
2006 Computer Design (ICCD '99), IEEE International Conference on  
In this paper, we propose dual-thread execution (DTE) for SMT processors to efficiently achieve transient-fault tolerance.  ...  In this paper, we apply the same principles as in FTDCE to SMT architectures and explore fetch policies to address the critical resource-sharing issue in SMT architectures.  ...  ACKNOWLEDGMENT We would like to thank the anonymous reviewers for their valuable suggestions to improve the paper.  ... 
doi:10.1109/iccd.2006.4380804 dblp:conf/iccd/MaZ06 fatcat:doihf6dze5b67hu3qwqdn6qdgy

Petri Net Analysis of Non-Redundant and Redundant Execution Schemes

Stefan Einer, Bernhard Fechner, Jörg Keller
2010 FERS-Mitteilungen  
Therefore, scheduling algorithms as simple as Round Robin can be recommended for redundant execution.  ...  This work investigates the influence of different instruction fetch algorithms on the performance of an SMT processor by modeling it with Petri nets.  ...  Instruction scheduling for SMT The instruction fetch phase within a processor works optimally if the following criteria are fulfilled.  ... 
doi:10.1007/bf03345446 fatcat:b6m5cyvxlbaqlg3wkahardxl7e

Thread-Sensitive Instruction Issue for SMT Processors

B. Robatmili, N. Yazdani, S. Sardashti, M. Nourani
2004 IEEE computer architecture letters  
The scheduling complexity and performance of an SMT processor depend on the topology used in the fetch and issue stages.  ...  In this paper, we propose a thread sensitive issue policy for a partitioned SMT processor which is based on a thread metric.  ...  Tullsen et al. proposed their SMT architecture as an extension to Superscalar processors and studied fetch policies for their SMT processor [6].  ... 
doi:10.1109/l-ca.2004.9 fatcat:r3g3x3yjirgwlc56zfvaft5kea

Exploiting Operand Availability for Efficient Simultaneous Multithreading

Joseph J. Sharkey, Dmitry V. Ponomarev
2007 IEEE transactions on computers  
We propose several schemes to improve the scalability, reduce the complexity and delays, and increase the throughput of dynamic scheduling in SMT processors.  ...  For schedulers with the capacity to hold 64 instructions on a 4-way SMT, the 2OP_BLOCK design outperforms the traditional queue by 14 percent, on average, and at the same time results in a 10 percent reduction  ...  SCHEDULER DESIGNS FOR SMT This section describes our proposed designs to maximize the scheduling efficiency of SMT processors.  ... 
doi:10.1109/tc.2007.28 fatcat:3vuvqhd3pfggvmnrihjb6m6i7a

Architectural Support for Network Applications on Simultaneous MultiThreading Processors

Kyueun Yi, Jean-Luc Gaudiot
2007 2007 IEEE International Parallel and Distributed Processing Symposium  
The goal of this paper is to evaluate the applicability and efficiency of Simultaneous Multi-Threaded (SMT) as a network processor.  ...  Hence, new architectures should be designed for the programmable network processors of the future.  ...  We have proposed and evaluated a packet dependency solution for SMT processors. The proposed strategy consists of packet schedulers and of a Load/Store instruction scheduler.  ... 
doi:10.1109/ipdps.2007.370236 dblp:conf/ipps/YiG07 fatcat:4vropjc5w5bt3gcczn7c5dbqkm

Probabilistic modeling for job symbiosis scheduling on SMT processors

Stijn Eyerman, Lieven Eeckhout
2012 ACM Transactions on Architecture and Code Optimization (TACO)  
a two-thread SMT processor, and an average 19% (and up to 45%) reduction in job turnaround time for a four-thread SMT processor.  ...  Symbiotic job scheduling improves simultaneous multithreading (SMT) processor performance by coscheduling jobs that have "compatible" demands on the processor's shared resources.  ...  ACKNOWLEDGMENTS We thank the reviewers for their constructive and insightful feedback.  ... 
doi:10.1145/2207222.2207223 fatcat:w366yp36sza3tdb2qn6olcs4fq

Simultaneous thin-thread processors for low-power embedded systems

Won W. Ro, Jaeyoung Yi, Joon-Sang Park, Joonseok Park
2008 IEICE Electronics Express  
Therefore, the system would eventually have to turn to advanced processors such as superscalars for the embedded processing cores.  ...  In this paper, we investigate the possibility to use multi-threaded processors to solve the problems with the traditional superscalar processors in embedded systems.  ...  With this forecast, it is crucial to provide more powerful embedded processors which could manage limited battery life efficiently for future applications.  ... 
doi:10.1587/elex.5.802 fatcat:axoa4v4bfje63hogzswwg3shci

Runahead Threads to improve SMT performance

Tanausu Ramirez, Alex Pajuelo, Oliverio J. Santana, Mateo Valero
2008 High-Performance Computer Architecture  
In this paper, we propose Runahead Threads (RaT) as a valuable solution for both reducing resource contention and exploiting memory-level parallelism in Simultaneous Multithreaded (SMT) processors.  ...  In addition, the proposed mechanism permits register file size reduction of up to 60% in a SMT processor without performance degradation.  ...  We would like to thank Dean Tullsen for his useful suggestions and help on improving this work.  ... 
doi:10.1109/hpca.2008.4658635 dblp:conf/hpca/RamirezPSV08 fatcat:ovcvq3z4h5a6xndteuguds7ma4

Simultaneous multithreading: a platform for next-generation processors

S.J. Eggers, J.S. Emer, H.M. Levy, J.L. Lo, R.L. Stamm, D.M. Tullsen
1997 IEEE Micro  
We also thank Jennifer Anderson of DEC Western Research Laboratory for copies of the SpecFP95 benchmarks, parallelized by the most recent version of the SUIF compiler, and Sujay Parekh for comments on  ...  Acknowledgments We thank John O'Donnell of Equator Technologies, Inc. and Tryggve Fossum of Digital Equipment Corp. for the source to the Alpha AXP version of the Multiflow compiler.  ...  Figure 1c shows how each cycle an SMT processor selects instructions for execution from all threads.  ... 
doi:10.1109/40.621209 fatcat:zmx4yx2flnfazi3b6zdwhavnam

Aggressive Scheduling and Speculation in Multithreaded Architectures: Is it Worth its Salt?

Jason Loew, Dmitry Ponomarev
2008 2008 20th International Symposium on Computer Architecture and High Performance Computing  
This paper investigates and quantifies the impacts of several aggressive performance-boosting techniques designed for superscalar processors on the performance of SMT architectures.  ...  Finally, we consider the impact of pipelining instruction scheduling logic over two cycles.  ...  Impact of Pipelining the Instruction Scheduling Logic on SMT In this section we examine the impact of pipelining the scheduling logic (wakeup and selection) over two cycles on an SMT processor.  ... 
doi:10.1109/sbac-pad.2008.15 dblp:conf/sbac-pad/LoewP08 fatcat:3eifhxntbndl7e2yofe5jx6niq

Symbiotic jobscheduling with priorities for a simultaneous multithreading processor

Allan Snavely, Dean M. Tullsen, Geoff Voelker
2002 Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems - SIGMETRICS '02  
This paper demonstrates that a scheduler for an SMT machine can both satisfy process priorities and symbiotically schedule low and high priority threads to increase system throughput.  ...  Using detailed simulation of an SMT architecture, we introduce and evaluate a series of five software and hardware-assisted priority schedulers.  ...  For a single-threaded processor, those two guarantees are nearly identical. However, on an SMT machine, they are not.  ... 
doi:10.1145/511334.511343 dblp:conf/sigmetrics/SnavelyTV02 fatcat:w43kfzdrijg7fnnqm5v2j4fvty