Toward kilo-instruction processors

Adrián Cristal, Oliverio J. Santana, Mateo Valero, José F. Martínez
2004 ACM Transactions on Architecture and Code Optimization (TACO)  
The continuously increasing gap between processor and memory speeds is a serious limitation to the performance achievable by future microprocessors. Currently, processors tolerate long-latency memory operations largely by maintaining a high number of in-flight instructions. In the future, this may require supporting many hundreds, or even thousands, of in-flight instructions. Unfortunately, the traditional approach of scaling up critical processor structures to provide such support is
more » ... l at these levels, due to area, power, and cycle time constraints. In this paper we show that, in order to overcome this resource-scalability problem, the way in which critical processor resources are managed must be changed. Instead of simply upsizing the processor structures, we propose a smarter use of the available resources, supported by a selective checkpointing mechanism. This mechanism allows instructions to commit out of order, and makes a reorder buffer unnecessary. We present a set of techniques such as multilevel instruction queues, late allocation and early release of registers, and early release of load/store queue entries. All together, these techniques constitute what we call a kilo-instruction processor, an architecture that can support thousands of in-flight instructions, and thus may achieve high performance even in the presence of large memory access latencies. A. Cristal et al.
doi:10.1145/1044823.1044825 fatcat:efu5hwogwffdnmub66gwp7g5xu