Predictive Modeling in a Polyhedral Optimization Space

Eunjung Park, John Cavazos, Louis-Noël Pouchet, Cédric Bastoul, Albert Cohen, P. Sadayappan
2013 International Journal of Parallel Programming
Significant advances in compiler optimization have been made in recent years, enabling many transformations such as tiling, fusion, parallelization and vectorization on imperfectly nested loops. Nevertheless, the problem of finding the best-performing combination of loop transformations remains a major challenge. Polyhedral models for compiler optimization have demonstrated their strong potential for increasing program performance, in particular for compute-intensive applications. But
... g static cost models to select a polyhedral optimization have shown their limitations, and to date iterative compilation has become a sound alternative to these models to uncover the maximal performance. But since the number of polyhedral optimization alternatives can be enormous, it is often impractical to iterate over a significant fraction of the entire space of polyhedrally transformed variants. Recent research has focused only on iterating over this search space with manually constructed heuristics or with expensive search algorithms (e.g., genetic algorithms) that can eventually find good points in the polyhedral space. In this paper, we propose to solve the polyhedral optimization selection problem using machine learning models. We show that these models can quickly find high-performance program variants in the polyhedral space, without resorting to extensive empirical search. We introduce models that take as input a characterization of a program based on its dynamic behavior, and predict the performance of aggressive high-level polyhedral transformations that include tiling, parallelization and vectorization. We allow for a minimal empirical search on the target machine, discovering on average 83% of the search-space-optimal combinations in at most 5 runs. Our end-to-end framework is validated on numerous benchmarks and two modern multi-core machines.
doi:10.1007/s10766-013-0241-1 fatcat:x3slz5gqlfdzzhu3jnlyfetjhy