When polyhedral transformations meet SIMD code generation

Martin Kong, Richard Veras, Kevin Stock, Franz Franchetti, Louis-Noël Pouchet, P. Sadayappan
2013 Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation - PLDI '13  
Data locality and parallelism are critical optimization objectives for performance on modern multi-core machines. Both coarse-grain parallelism (e.g., multi-core) and fine-grain parallelism (e.g., vector SIMD) must be effectively exploited, but despite decades of progress at both ends, current compiler optimization schemes that attempt to address data locality and both kinds of parallelism often fail at one of the three objectives. We address this problem by proposing a 3-step framework, which
more » ... ims for integrated data locality, multi-core parallelism and SIMD execution of programs. We define the concept of vectorizable codelets, with properties tailored to achieve effective SIMD code generation for the codelets. We leverage the power of a modern high-level transformation framework to restructure a program to expose good ISA-independent vectorizable codelets, exploiting multi-dimensional data reuse. Then, we generate ISA-specific customized code for the codelets, using a collection of lower-level SIMD-focused optimizations. We demonstrate our approach on a collection of numerical kernels that we automatically tile, parallelize and vectorize, exhibiting significant performance improvements over existing compilers.
doi:10.1145/2491956.2462187 dblp:conf/pldi/KongVSFPS13 fatcat:2cqpl2tqobcjvbdlpoarmfbx3u