A Decoupled KILO-Instruction Processor

M. Pericas, A. Cristal, R. Gonzalez, D.A. Jimenez, M. Valero
The Twelfth International Symposium on High-Performance Computer Architecture, 2006.  
Building processors with large instruction windows has been proposed as a mechanism for overcoming the memory wall, but finding a feasible and implementable design has been an elusive goal. Traditional processors are composed of structures that do not scale to large instruction windows because of timing and power constraints. However, the behavior of programs executed with large instruction windows gives rise to a natural and simple alternative to scaling. We characterize this phenomenon of execution locality and propose a microarchitecture to exploit it to achieve the benefit of a large instruction window processor with low implementation cost. Execution locality is the tendency of instructions to exhibit high or low latency based on their dependence on memory operations. In this paper we propose a decoupled microarchitecture that executes low latency instructions on a Cache Processor and high latency instructions on a Memory Processor. We demonstrate that such a design, using small structures and many in-order components, can achieve the same performance as much more aggressive proposals while minimizing design complexity.
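The idea of steering by execution locality can be illustrated with a toy model. The sketch below is not the paper's mechanism; it only assumes that an instruction counts as high latency if it is a load that misses in the cache or if it reads a register produced by such an instruction. The names `Instr`, `misses_cache`, and `steer` are hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    dest: str                    # destination register name
    srcs: tuple                  # source register names
    misses_cache: bool = False   # assumed flag: a load that misses in the cache


def steer(program):
    """Label each instruction 'cache' (low latency) or 'memory' (high latency)."""
    slow_regs = set()  # registers whose current value depends on a cache miss
    labels = []
    for ins in program:
        slow = ins.misses_cache or any(s in slow_regs for s in ins.srcs)
        if slow:
            slow_regs.add(ins.dest)
        else:
            # A low-latency result overwrites any earlier slow value.
            slow_regs.discard(ins.dest)
        labels.append("memory" if slow else "cache")
    return labels


program = [
    Instr("r1", ("r0",), misses_cache=True),  # missing load
    Instr("r2", ("r1",)),                     # depends on the miss
    Instr("r3", ("r0",)),                     # independent of the miss
    Instr("r1", ("r3",)),                     # redefines r1 with a fast value
    Instr("r4", ("r1", "r2")),                # r2 is still slow
]
labels = steer(program)
print(labels)
```

In this toy model the dependence on a miss propagates through registers in program order, so instructions naturally split into a low-latency group and a high-latency group, which is the property a decoupled design of this kind would rely on.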
doi:10.1109/hpca.2006.1598112 dblp:conf/hpca/PericasCGJV06 fatcat:hadimequkjfhnbx3gayqhd7nem