Model-Guided Empirical Optimization for Multimedia Extension Architectures: A Case Study

Chun Chen, Jaewook Shin, Shiva Kintali, Jacqueline Chame, Mary Hall
2007 IEEE International Parallel and Distributed Processing Symposium
Compiler technology for multimedia extensions must effectively utilize not only the SIMD compute engines but also the various levels of the memory hierarchy: superword registers, multi-level caches, and the TLB. In this paper, we describe a compiler that combines optimization across all levels of the memory hierarchy with automatic generation of SIMD code for multimedia extensions. At the high level, model-guided empirical optimization is used to transform code to optimize for all levels of the memory hierarchy. This compiler interacts with a backend compiler that exploits superword-level parallelism, taking sequential code as input and producing SIMD code. This paper discusses how we have combined these technologies into a single framework. Through a case study with matrix multiply, we observe performance that exceeds that of the hand-tuned Intel MKL library, is within 4% of the ATLAS self-tuning library with architectural defaults, and is more than 4X faster than the native Intel compiler.
doi:10.1109/ipdps.2007.370641 dblp:conf/ipps/ChenSKCH07 fatcat:awdgu3nk45dwdcvxoejdavsfey