Checkpointed early load retirement

N. Kirman, M. Kirman, M. Chaudhuri, J.F. Martinez
2005 11th International Symposium on High-Performance Computer Architecture  
To attack this problem, we propose checkpointed early load retirement, a mechanism that combines register checkpointing and back-end (i.e., at retirement) load-value prediction.  ...  (3) EARly-retiring the long-latency load.  ...  CHECKPOINTED EARLY LOAD RETIREMENT The goal of checkpointed early load retirement is to tolerate long-latency loads by (1) retiring them early, so as to not block the flow of instructions through the processor  ... 
doi:10.1109/hpca.2005.9 dblp:conf/hpca/KirmanKCM05 fatcat:jupfyzt6rvdsblvccvle5hqudu

An analysis of a resource efficient checkpoint architecture

Haitham Akkary, Ravi Rajwar, Srikanth T. Srinivasan
2004 ACM Transactions on Architecture and Code Optimization (TACO)  
for fast store-load forwarding, and an effective algorithm for aggressive physical register reclamation.  ...  This paper proposes a novel checkpoint processing and recovery (CPR) microarchitecture, and shows how to implement a large instruction window processor without requiring large structures thus permitting  ...  Early resource reclamation is limited to a subset of the ROB.  ... 
doi:10.1145/1044823.1044826 fatcat:w2ny6zsljncvjbuliobproywce

An Efficient Low-Complexity Alternative to the ROB for Out-of-Order Retirement of Instructions

Salvador Petit, Rafael Ubal, Julio Sahuquillo, Pedro Lopez, Jose Duato
2009 2009 12th Euromicro Conference on Digital System Design, Architectures, Methods and Tools  
In this paper, a checkpoint-free out-of-order commit architecture is proposed, which replaces the ROB with a small structure called Validation Buffer (VB) from which instructions are retired as soon as  ...  Instructions are retired from this structure in program order, which may lead to significant performance degradation if a long latency operation blocks the ROB head.  ...  In [9], Kirman et al. propose the checkpointed early load retirement mechanism which has certain similarities with the previous proposal.  ... 
doi:10.1109/dsd.2009.237 dblp:conf/dsd/PetitUSLD09 fatcat:2v7xdzibdbbxxpzkf2whf4lmlu

InvisiFence

Colin Blundell, Milo M.K. Martin, Thomas F. Wenisch
2009 Proceedings of the 36th annual international symposium on Computer architecture - ISCA '09  
A multiprocessor's memory consistency model imposes ordering constraints among loads, stores, atomic operations, and memory fences.  ...  Even for consistency models that relax ordering among loads and stores, ordering constraints still induce significant performance penalties due to atomic operations and memory ordering fences.  ...  INVISIFENCE joins the growing body of work (e.g., checkpointed early load retirement, speculative compiler optimizations, speculative locking, and best-effort transactional memory) that exploits similar  ... 
doi:10.1145/1555754.1555785 dblp:conf/isca/BlundellMW09 fatcat:etlxqff25zhutlw2dqpbkfc7uu

CAVA: Hiding L2 Misses with Checkpoint-Assisted Value Prediction

L. Ceze, K. Strauss, J. Tuck, J. Renau, J. Torrellas
2004 IEEE computer architecture letters  
Load misses in on-chip L2 caches often end up stalling modern superscalars. To address this problem, we propose hiding L2 misses with Checkpoint-Assisted VAlue prediction (CAVA).  ...  If the missing load reaches the head of the reorder buffer before the requested data is received from memory, the processor checkpoints, consumes the predicted value, and speculatively continues execution  ...  [8] proposed Cherry, where resources are recycled early by leveraging checkpoints. Akkary et al. [1] and Cristal et al.  ... 
doi:10.1109/l-ca.2004.3 fatcat:5w7fzbv6rzaq3n5w5acvpj5tsm

CAVA

Luis Ceze, Karin Strauss, James Tuck, Josep Torrellas, Jose Renau
2006 ACM Transactions on Architecture and Code Optimization (TACO)  
When the missing load finally reaches the head of the ROB, the processor checkpoints its state, retires the load, and speculatively uses the predicted value and continues execution.  ...  Specifically, Runahead also uses checkpointing to allow processors to retire missing loads and continue execution. However, Runahead and CAVA differ in three major ways.  ...  Consequently, CAVA checkpoints at the load retirement.  ... 
doi:10.1145/1138035.1138038 fatcat:b7gutp6sjndbxfjk4v4mckejme

Long-latency branches

Abhas Kumar, Nisheet Jain, Mainak Chaudhuri
2006 SIGARCH Computer Architecture News  
Architectures that allow checkpoint-assisted speculative load retirement fetch a large number of branches belonging to the dependence chains of the speculatively retired loads.  ...  Fetched branches belonging to the dependence chains of loads that miss in the L1 data cache exhibit very high misprediction penalty due to the delay in the execution resulting from unavailability of operands  ...  CLEAR augments load-value prediction with early retirement.  ... 
doi:10.1145/1152394.1152396 fatcat:n3rjcns5czdynplilej6vqmxla

An Energy Efficient Instruction Window for Scalable Processor Architecture

M. CHOI, S. MAENG
2008 IEICE transactions on electronics  
First, the small reorder buffer (SROB) reduces power dissipation by deferred allocation and early release.  ...  The early load retirement [4] mechanism combines register checkpointing and early release of the load instruction. This allows instructions dependent on the long-latency load to execute sooner.  ...  When the instruction window is blocked by the long-latency instruction, the architectural register state is checkpointed and retires early.  ... 
doi:10.1093/ietele/e91-c.9.1427 fatcat:2f63nye4kzct7hdx536a4jb3ki

Toward kilo-instruction processors

Adrián Cristal, Oliverio J. Santana, Mateo Valero, José F. Martínez
2004 ACM Transactions on Architecture and Code Optimization (TACO)  
We present a set of techniques such as multilevel instruction queues, late allocation and early release of registers, and early release of load/store queue entries.  ...  Instead of simply upsizing the processor structures, we propose a smarter use of the available resources, supported by a selective checkpointing mechanism.  ...  These checkpoints are useful to release instructions in the ROB early, to release physical registers early, and to remove load instructions early from the load/store queue.  ... 
doi:10.1145/1044823.1044825 fatcat:efu5hwogwffdnmub66gwp7g5xu

A comparative performance evaluation of various state maintenance mechanisms

Michael Butler, Yale Patt
1993 Proceedings of the 26th Annual International Symposium on Microarchitecture  
In this study we examine the performance implications of three common state maintenance mechanisms: the reorder buffer, the history buffer, and checkpointing.  ...  Similarly, retiring a checkpoint that is no longer needed simply requires clearing the corresponding bit in these bit fields.  ...  Previous loads from unknown addresses will stall all subsequent stores until the load address has been resolved, and previous stores to unknown addresses will stall subsequent loads and stores.  ... 
doi:10.1109/micro.1993.282743 dblp:conf/micro/ButlerP93 fatcat:rh3ooyurovg5tfbk7oavvcmkbm

Dual-core execution: building a highly scalable single-thread instruction window

Huiyang Zhou
2005 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05)  
The front processor fetches and preprocesses instruction streams and retires processed instructions into the queue for the back processor to consume.  ...  The front processor executes instructions as usual except for cache-missing loads, which produce an invalid value instead of blocking the pipeline.  ...  Early branch resolution is also proposed in out-of-order commit processors using checkpointing and early release of ROB entries [11], [12].  ... 
doi:10.1109/pact.2005.18 dblp:conf/IEEEpact/Zhou05 fatcat:trbkqaflhndydpqebxkimzj3tm

A Sequentially Consistent Multiprocessor Architecture for Out-of-Order Retirement of Instructions

Rafael Ubal, Julio Sahuquillo, Salvador Petit, Pedro Lopez, David Kaeli
2012 IEEE Transactions on Parallel and Distributed Systems  
Based on the Validation Buffer (VB) architecture (a previously proposed out-of-order retirement, checkpoint-free architecture for single processors), this paper proposes a cost-effective, scalable, out-of-order  ...  Expanding the width of the instruction window can be highly beneficial to multiprocessors that implement a strict memory model, especially when both loads and stores encounter long latencies due to cache  ...  In [17], speculative retirement of loads is improved by also retiring stores before their speculative state is confirmed.  ... 
doi:10.1109/tpds.2011.255 fatcat:dbuvqyeu7ndhbludegyawcxcdy

Physical Register Inlining

Mikko H. Lipasti, Brian R. Mestan, Erika Gunadi
2004 SIGARCH Computer Architecture News  
Each map checkpoint increments the reference count of each physical register that it points to. As map checkpoints are retired, these reference counts are decremented.  ...  p7] and p2<= p3 & p4 add p5<= p1 + p2 1) This load misses the cache and delays wakeup of the dependent add 2) This add retires and finds p2 to be a narrow value, updates the map, and frees  ... 
doi:10.1145/1028176.1006728 fatcat:mkxmimb4h5gnxiaowo4by6exfy

Continual flow pipelines

Srikanth T. Srinivasan, Ravi Rajwar, Haitham Akkary, Amit Gandhi, Mike Upton
2004 SIGPLAN notices  
Under CFP, registers held by a checkpoint cannot be released until the checkpoint retires.  ...  Avg. # of registers held by checkpoints Benchmark Suite Avg. SDB size when occupied % retired inst.  ... 
doi:10.1145/1037187.1024407 fatcat:66u2ds4as5dlpjdvgpmevakaoi

Continual flow pipelines

Srikanth T. Srinivasan, Ravi Rajwar, Haitham Akkary, Amit Gandhi, Mike Upton
2004 ACM SIGOPS Operating Systems Review  
Under CFP, registers held by a checkpoint cannot be released until the checkpoint retires.  ...  Avg. # of registers held by checkpoints Benchmark Suite Avg. SDB size when occupied % retired inst.  ... 
doi:10.1145/1037949.1024407 fatcat:ezo33dyqhrfnffhx6lo3ao2x3e