Simultaneous multithreaded vector architecture: merging ILP and DLP for high performance

R. Espasa, M. Valero
Proceedings Fourth International Conference on High-Performance Computing  
http://www.ac.upc.es/hpc
The goal of this paper is to show that instruction-level parallelism (ILP) and data-level parallelism (DLP) can be merged in a single simultaneous vector multithreaded architecture to execute regular vectorizable code at a performance level that cannot be achieved using either paradigm on its own. We will show that the combination of the two techniques yields very high performance at low cost and low complexity. We will show that this [...] achieves a sustained performance on regular numerical codes that is 20 times the performance that can be achieved with today's superscalar microprocessors. Moreover, we will show that the architecture can tolerate very large memory latencies, of up to 100 cycles, with a relatively small performance degradation. This high performance is independent of working set size or of locality considerations, since the DLP paradigm allows very efficient exploitation of a high performance flat memory bandwidth.
doi:10.1109/hipc.1997.634514 dblp:conf/hipc/EspasaV97 fatcat:hv4xaa5xpbga7fknram3tjspsy