Reducing the Scheduling Critical Cycle Using Wakeup Prediction

T.E. Ehrhart, S.J. Patel
10th International Symposium on High Performance Computer Architecture (HPCA'04)  
This form of self-scheduling reduces the critical cycle by eliminating the wakeup logic at the expense of additional replays.  ...  This idea is used to build a machine where wakeup times are predicted, and instructions executed too early are replayed.  ...  Acknowledgements The authors thank the other members of the Advanced Computing Systems group as well as the anonymous referees for providing feedback during various stages of this work.  ... 
doi:10.1109/hpca.2004.10016 dblp:conf/hpca/EhrhartP04 fatcat:wct4j3wejnbmtogxnxwpcc7bmi

Instruction Recirculation: Eliminating Counting Logic in Wakeup-Free Schedulers [chapter]

Joseph J. Sharkey, Dmitry V. Ponomarev
2005 Lecture Notes in Computer Science  
Such wakeup-free scheduling techniques remove the wakeup delay from the critical path, but incur other forms of complexity, essentially stemming from the need to keep track of the cycle when each physical  ...  This complexity reduction is also accompanied by 3.6% IPC improvement over the state-of-the-art wakeup-free scheduler.  ...  Acknowledgements We would like to thank Matt Yourst for help with the microarchitectural simulation environment.  ... 
doi:10.1007/11549468_62 fatcat:e3qbv2h43zfnnchukohheafoxm

On reducing misspeculations in a pipelined scheduler

R. Gran, E. Morancho, A. Olive, J.M. Llaberia
2009 2009 IEEE International Symposium on Parallel & Distributed Processing  
On average, DLS reduces the number of misspeculated instructions with respect to a speculative scheduler by 17.9%. From the IPC point of view, the speculative scheduler outperforms DLS by 0.3%.  ...  In this work we introduce a non-speculative mechanism named Dependence Level Scheduler (DLS) which not only tolerates the scheduling-logic latency but also reduces the number of misspeculated instructions  ...  Wakeup latency is addressed by reducing the load capacitance of the wakeup tag bus [8][12][18] or by using index-based wakeup [5][6][12][14][21][29]  ... 
doi:10.1109/ipdps.2009.5160990 dblp:conf/ipps/TejeroMOL09 fatcat:3f5hjoxtg5b6rjfqmkscrkdv4e

Improving Scalability and Complexity of Dynamic Scheduler through Wakeup-based Scheduling

Kuo-Su Hsiao, Chung-Ho Chen
2006 Computer Design (ICCD '99), IEEE International Conference on  
The other segments are excluded from the wakeup operation to reduce the useless wakeup activities.  ...  The experimental results show that the proposed technique saves 50-61% of the power consumption, reduces 42-76% in the wakeup latency compared to the conventional design.  ...  Acknowledgments This work was supported in part by the National Science Council, Taiwan under Grant No. NSC 94-2220-E-006-008.  ... 
doi:10.1109/iccd.2006.4380817 dblp:conf/iccd/HsiaoC06 fatcat:p2undxnryzfurlmho7xosly2xm

Half-price architecture

Ilhyun Kim, Mikko H. Lipasti
2003 SIGARCH Computer Architecture News  
Two techniques are proposed and evaluated: one for the wakeup logic is sequential wakeup, which decouples half of the tag matching logic from the wakeup bus to reduce the load capacitance of the bus.  ...  Handling two source operands requires multiple ports for each instruction in structures--such as the register file and wakeup logic--which are often in the processor's critical timing paths.  ...  We would also like to thank the anonymous reviewers for their many valuable comments. References  ... 
doi:10.1145/871656.859623 fatcat:ewgcpoergrgijgup3viu7zkksq

Half-price architecture

Ilhyun Kim, Mikko H. Lipasti
2003 Proceedings of the 30th annual international symposium on Computer architecture - ISCA '03  
Two techniques are proposed and evaluated: one for the wakeup logic is sequential wakeup, which decouples half of the tag matching logic from the wakeup bus to reduce the load capacitance of the bus.  ...  Handling two source operands requires multiple ports for each instruction in structures--such as the register file and wakeup logic--which are often in the processor's critical timing paths.  ...  We would also like to thank the anonymous reviewers for their many valuable comments. References  ... 
doi:10.1145/859618.859623 fatcat:gyuky6ao5zahndqdpamuerka7m

Half-price architecture

Ilhyun Kim, Mikko H. Lipasti
2003 Proceedings of the 30th annual international symposium on Computer architecture - ISCA '03  
Two techniques are proposed and evaluated: one for the wakeup logic is sequential wakeup, which decouples half of the tag matching logic from the wakeup bus to reduce the load capacitance of the bus.  ...  Handling two source operands requires multiple ports for each instruction in structures--such as the register file and wakeup logic--which are often in the processor's critical timing paths.  ...  We would also like to thank the anonymous reviewers for their many valuable comments. References  ... 
doi:10.1145/859622.859623 fatcat:3sj5pmh5bzhazi7b5ygbkdqbum

Wakeup scheduling for energy-efficient communication in opportunistic mobile networks

Wei Gao, Qinghua Li
2013 2013 Proceedings IEEE INFOCOM  
In this paper, we propose novel techniques to adaptively schedule wakeup periods of mobile nodes between their inter-contact times.  ...  Periodic contact probing is required to facilitate opportunistic communication, but seriously reduces the limited battery life of mobile devices.  ...  We first evaluate the accuracy of contact prediction using the percentage of missing contacts due to wakeup scheduling.  ... 
doi:10.1109/infcom.2013.6567007 dblp:conf/infocom/GaoL13 fatcat:zie2gtwoa5cppcv45reqfeqqti

Non-uniform Instruction Scheduling [chapter]

Joseph J. Sharkey, Dmitry V. Ponomarev
2005 Lecture Notes in Computer Science  
We then propose a Non-Uniform Scheduler - a design that partitions the scheduling logic into two queues, each with dedicated wakeup and selection logic: a small Fast Issue Queue (FIQ) to issue critical  ...  Dynamic instruction scheduling logic is one of the most critical and cycle-limiting structures in modern superscalar processors, and it is not easily pipelined without significant losses in performance  ...  We would also like to thank Kanad Ghose and Deniz Balkan for useful comments on earlier drafts of this paper.  ... 
doi:10.1007/11549468_61 fatcat:zcnbq2fclfehfdz5tfhn6wikei

SEED

Francisco J. Mesa-Martínez, Michael C. Huang, Jose Renau
2006 Proceedings of the 15th international conference on Parallel architectures and compilation techniques - PACT '06  
Conventional designs rely on atomic wakeup-select cycles to ensure compact scheduling.  ...  wakeup block feeding an in-order scheduler.  ...  This removes the select logic from the timing-critical process of scheduling dependent instructions to execute back-to-back.  ... 
doi:10.1145/1152154.1152193 dblp:conf/IEEEpact/Mesa-MartinezHR06 fatcat:6j7xgjn7nzdwpdzynt7vu5ywne

On pipelining dynamic instruction scheduling logic

Jared Stark, Mary D. Brown, Yale N. Patt
2000 Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture - MICRO 33  
the ability to execute dependent instructions in consecutive cycles, and within 2% of the IPC of a conventional machine that uses single cycle scheduling logic.  ...  This paper offers a third, acceptable, alternative: pipelined scheduling with speculative wakeup.  ...  Acknowledgements We especially thank Paul Racunas, Jean-Loup Baer, and the anonymous referees for their comments on earlier drafts of this paper.  ... 
doi:10.1145/360128.360136 fatcat:ph4adzod7rfw3fey2hvhl7wule

Matrix scheduler reloaded

Peter G. Sassone, Jeff Rupley, Edward Brekelbaum, Gabriel H. Loh, Bryan Black
2007 SIGARCH Computer Architecture News  
This technique can be used to create quicker isoperformance schedulers (17-58% reduced critical path) or larger isotiming schedulers (7-26% IPC increase).  ...  Both are based on the simple observation that the wakeup and picker matrices are sparse, even at small sizes; thus small indirection tables can be used to greatly reduce their width and latency.  ...  These small IPC losses shown are offsets against the reduced critical path distance through the wakeup matrix.  ... 
doi:10.1145/1273440.1250704 fatcat:ynlvelf3cvbhppr4dgmg3xe3xq

Matrix scheduler reloaded

Peter G. Sassone, Jeff Rupley, Edward Brekelbaum, Gabriel H. Loh, Bryan Black
2007 Proceedings of the 34th annual international symposium on Computer architecture - ISCA '07  
This technique can be used to create quicker isoperformance schedulers (17-58% reduced critical path) or larger isotiming schedulers (7-26% IPC increase).  ...  Both are based on the simple observation that the wakeup and picker matrices are sparse, even at small sizes; thus small indirection tables can be used to greatly reduce their width and latency.  ...  These small IPC losses shown are offsets against the reduced critical path distance through the wakeup matrix.  ... 
doi:10.1145/1250662.1250704 dblp:conf/isca/SassoneRBLB07 fatcat:iaeii6y74nf3parav7acbne2we

Efficient dynamic scheduling through tag elimination

Dan Ernst, Todd Austin
2002 SIGARCH Computer Architecture News  
By putting these instructions into specialized windows with fewer tag comparators, load capacitance on the scheduler critical path can be reduced, with only very small effects on program throughput.  ...  An increasingly large portion of scheduler latency is derived from the monolithic content addressable memory (CAM) arrays accessed during instruction wakeup.  ...  We also thank all of the reviewers and our colleagues for their insights and suggestions for strengthening our paper.  ... 
doi:10.1145/545214.545221 fatcat:ofgyx745knamjgtmwrzuo2r2wi

An efficient wakeup design for energy reduction in high-performance superscalar processors

Kuo-Su Hsiao, Chung-Ho Chen
2005 Proceedings of the 2nd conference on Computing frontiers - CF '05  
In modern superscalar processors, the complex instruction scheduler could form the critical path of the pipeline stages and limit the clock cycle time.  ...  In speed, the proposed design reduces an average of 77% in the wakeup latency compared to the conventional CAM-based design and an average of 33% reduction of the latency of the bit-map RAM design.  ...  Ernst et al. also proposed a Cyclone scheduler that predicts the operand arrival time and schedules instructions in a countdown cyclic queue.  ... 
doi:10.1145/1062261.1062319 dblp:conf/cf/HsiaoC05 fatcat:ghumbkx6mnhw3mmnxpqe35e67q