Using internal redundant representations and limited bypass to support pipelined adders and register files

M.D. Brown, Y.N. Patt
Proceedings Eighth International Symposium on High Performance Computer Architecture  
This paper evaluates the use of redundant binary and pipelined 2's complement adders in out-of-order execution cores. Redundant binary adders reduce the ADD latency to less than half that of traditional 2's complement adders, allowing higher core clock frequencies and greater execution bandwidth (in instructions per second). Pipelined 2's complement adders allow a higher clock frequency, but do not reduce the ADD latency. Machines with redundant binary adders are compared to machines with 2's complement adders and the same execution bandwidth and bypass network complexity. Results show that on the SPECint95 benchmarks, the average IPC of an 8-wide machine with 1-cycle redundant binary adders is 9% higher than that of a machine using 2-cycle pipelined adders. Pipelined functional units and multi-cycle register files may require multi-level bypass networks to guarantee that an instruction's result is available in any cycle after it is produced. Multi-level bypass networks require large fan-in input muxes that increase cycle time. This paper shows that one level of bypass paths in a multi-level bypass network can be removed while still achieving within 1% to 3% of the IPC of a machine with a full bypass network.
doi:10.1109/hpca.2002.995718 dblp:conf/hpca/BrownP02 fatcat:petjpr522jblpp2jahwyhon65e
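
The abstract does not describe the adder's digit encoding. As a rough illustration of why a redundant representation can shorten ADD latency, the following is a minimal Python sketch of carry-free addition on radix-2 signed digits (each digit in {-1, 0, 1}). It follows a textbook signed-digit scheme and is not necessarily the circuit evaluated in the paper. The function names `to_digits`, `to_int`, and `rb_add` are hypothetical and chosen only for this example.

```python
# Sketch of carry-free redundant binary (signed-digit) addition.
# Assumption: digits are stored least-significant first, each in {-1, 0, 1}.
# This is a textbook scheme, not taken from the paper itself.

def to_digits(value, width):
    """Encode an integer as `width` signed digits (sign applied to each magnitude bit)."""
    sign = -1 if value < 0 else 1
    magnitude = abs(value)
    return [sign * ((magnitude >> i) & 1) for i in range(width)]


def to_int(digits):
    """Convert signed digits back to an ordinary integer.

    In hardware this step is the slow one: it needs a full carry-propagate
    subtraction, which is why results are kept redundant between ADDs.
    """
    return sum(d * (1 << i) for i, d in enumerate(digits))


def rb_add(x, y):
    """Add two equal-width signed-digit numbers; returns width + 1 digits.

    Each output digit depends only on input positions i, i-1 and i-2,
    so the logic depth does not grow with the operand width.
    """
    width = len(x)
    interim = [0] * width          # interim sum digit per position
    transfer = [0] * (width + 1)   # transfer digit into each position
    for i in range(width):         # iterations are independent of each other
        total = x[i] + y[i]
        # Looking one position down tells us the sign of the incoming transfer.
        lower_nonneg = i == 0 or (x[i - 1] >= 0 and y[i - 1] >= 0)
        if total == 2:
            t, s = 1, 0
        elif total == 1:
            t, s = (1, -1) if lower_nonneg else (0, 1)
        elif total == 0:
            t, s = 0, 0
        elif total == -1:
            t, s = (0, -1) if lower_nonneg else (-1, 1)
        else:  # total == -2
            t, s = -1, 0
        interim[i] = s
        transfer[i + 1] = t
    # interim[i] + transfer[i] always lands in {-1, 0, 1}: no carry chain.
    return [interim[i] + transfer[i] for i in range(width)] + [transfer[width]]


if __name__ == "__main__":
    a, b = to_digits(-3, 8), to_digits(1, 8)
    print(to_int(rb_add(a, b)))
```

The loop runs sequentially in software, but no iteration reads a value written by another iteration, which is the property that lets a hardware adder of this kind avoid carry propagation. The result of `rb_add` can be passed directly to another `rb_add` without calling `to_int` in between.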