DataScalar architectures

Doug Burger, Stefanos Kaxiras, James R. Goodman
Proceedings of the 24th Annual International Symposium on Computer Architecture (ISCA '97), 1997
DataScalar architectures improve memory system performance by running computation redundantly across multiple processors, each tightly coupled with an associated memory. The program's data set (and/or text) is distributed across these memories. In this execution model, each processor broadcasts the operands it loads from its local memory to all other units. In this paper, we describe the benefits, costs, and problems associated with the DataScalar model. We also present simulation results of one possible implementation of a DataScalar system. In our simulated implementation, six unmodified SPEC95 binaries ran from 7% slower to 50% faster on two nodes, and from 9% to 100% faster on four nodes, than on a system with a comparable, more traditional memory system. Our intuition and results show that DataScalar architectures work best with codes for which traditional parallelization techniques fail. We conclude with a discussion of how DataScalar systems may accommodate traditional parallel processing, thus improving performance over a much wider range of applications than is currently possible with either model.

1. Except when the data are cached, in which case the cache line is updated and no write-through or write-back is required.
doi:10.1145/264107.264215 dblp:conf/isca/BurgerKG97
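The execution model in the abstract can be illustrated with a minimal sketch: every node runs the same program redundantly, the data set is partitioned across node-local memories, and whichever node owns a loaded address broadcasts the operand to all others, so no node ever issues a remote load request. The class and function names below are illustrative assumptions, not from the paper, and the sketch ignores caching, timing, and the broadcast interconnect.

```python
# Hypothetical sketch of the DataScalar execution model: N nodes run the
# same computation redundantly; each load is satisfied by the one node
# that owns the address, which broadcasts the value to everyone.

class DataScalarNode:
    def __init__(self, node_id, local_memory):
        self.node_id = node_id
        self.local_memory = local_memory  # {address: value} owned by this node

def simulated_load(nodes, address):
    """All nodes execute the same load; the owning node broadcasts the operand."""
    owner = next(n for n in nodes if address in n.local_memory)
    value = owner.local_memory[address]
    # Broadcast: every node (including the owner) now holds the operand,
    # so all can continue the redundant computation in lockstep.
    return {n.node_id: value for n in nodes}

# Program data distributed across two nodes' local memories.
nodes = [
    DataScalarNode(0, {0x10: 7}),
    DataScalarNode(1, {0x20: 35}),
]

# Every node redundantly computes a + b; each operand arrives via a single
# broadcast from its owning node, with no request/response round trip.
a = simulated_load(nodes, 0x10)
b = simulated_load(nodes, 0x20)
result = {nid: a[nid] + b[nid] for nid in a}
```

Note that both nodes end up with the same result, which is the point of the model: redundant computation trades extra work for the elimination of remote-load latency on the critical path.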