Evaluation of architectural paradigms for addressing the processor-memory gap
Many high-performance applications run well below the peak arithmetic performance of the underlying machine, with inefficiencies often attributed to poor memory system behavior. In the context of scientific computing, we examine three emerging processors designed to address the well-known gap between processor and memory performance through the exploitation of data parallelism. The VIRAM architecture uses novel PIM technology to combine embedded DRAM with a vector co-processor for exploiting its large bandwidth potential. The DIVA architecture incorporates a collection of PIM chips as smart-memory coprocessors to a conventional microprocessor, and relies on superword-level parallelism to make effective use of the available memory bandwidth. The Imagine architecture provides a stream-aware memory hierarchy to support the tremendous processing potential of SIMD-controlled VLIW clusters. First, we develop a scalable synthetic probe that allows us to parameterize key performance attributes of VIRAM, DIVA, and Imagine while capturing the performance crossover points of these architectures. Next, we present results for scientific kernels with different sets of computational characteristics and memory access patterns. Our experiments allow us to evaluate the strategies employed to exploit data parallelism, isolate the set of application characteristics best suited to each architecture, and show a promising direction towards interfacing leading-edge processor technology with high-end scientific computations.