Challenges in the automatic parallelization of large-scale computational applications
Commercial Applications for High-Performance Computing
Application test suites used in the development of parallelizing compilers typically include single-file programs and algorithm kernels. The challenges posed by full-scale commercial applications are rarely addressed, and it is often assumed that automatic parallelization is not feasible for large, realistic programs. In this paper, we reveal some of the hurdles that must be crossed in order to enable these compilers to apply parallelization techniques to large-scale codes. We use a benchmark suite that has been specifically designed to exhibit the computing needs found in industry. The benchmarks are provided by the High Performance Group of the Standard Performance Evaluation Corporation (SPEC). They consist of a seismic processing application and a quantum-level molecular simulation. Both applications exist in a serial and a parallel variant. The parallel variants are hand-parallelized with shared-memory directives, either at the largest level of granularity or in a hybrid manner where MPI is used at the largest level of granularity and OpenMP directives are used at a lower level. In our studies we compare the parallel variants with the automatically parallelized serial codes. We use the Polaris parallelizing compiler, which takes Fortran codes and inserts OpenMP directives around loops determined to be dependence-free. Polaris also reports the reasons why it assumes that a loop is parallel. We have found five challenges faced by an automatic parallelizing compiler when dealing with full applications: modularity, legacy optimizations, symbolic analysis, array reshaping, and issues arising from input/output operations. The results of this work will be used to equip parallelizing compilers with the capabilities necessary for handling commercially relevant science and engineering applications.