Compiling Stencils in High Performance Fortran

Gerald Roth, John Mellor-Crummey, Ken Kennedy, R. Gregg Brickner
1997 Proceedings of the 1997 ACM/IEEE conference on Supercomputing (CDROM) - Supercomputing '97  
For many Fortran90 and HPF programs performing dense matrix computations, the main computational portion of the program belongs to a class of kernels known as stencils. Stencil computations are commonly used in solving partial differential equations, image processing, and geometric modeling. The efficient handling of such stencils is critical for achieving high performance on distributed-memory machines. Compiling stencils into efficient code is viewed as so important that some have built special-purpose compilers for handling them and others have added stencil recognizers to existing compilers. In this paper we present a general compilation strategy for stencils written using Fortran90 array constructs. Our strategy is capable of optimizing single or multistatement stencils and is applicable to stencils specified with shift intrinsics or with array syntax, all equally well. The strategy eliminates the need for pattern-recognition algorithms by orchestrating a set of optimizations that address the overhead of both intraprocessor and interprocessor data movement that results from the translation of Fortran90 array constructs. Our experimental results show that code produced by this strategy beats or matches the best code produced by the special-purpose compilers or pattern-recognition schemes that are known to us. In addition, our strategy produces highly optimized code in situations where the others fail, producing several orders of magnitude performance improvement, and thus provides a stencil compilation strategy that is more robust than its predecessors.
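To make the two stencil notations named in the abstract concrete, the following is a minimal Python sketch, not Fortran or HPF code and not the compilation strategy of the paper. It models a four-point averaging stencil twice: once with whole-array circular shifts in the manner of the Fortran 90 CSHIFT intrinsic, and once with explicit neighbor offsets over the interior, as array-section syntax would express it. The function names (`cshift`, `stencil_shift`, `stencil_sections`), the choice of stencil, and the sample grid are all assumptions made for illustration.

```python
def cshift(a, shift, dim):
    """Circular shift of a 2-D list, modeled on Fortran 90 CSHIFT.

    Element i of the result along `dim` is element (i + shift) mod n of `a`.
    dim = 0 shifts along rows, dim = 1 along columns (0-based, unlike Fortran).
    """
    n_rows, n_cols = len(a), len(a[0])
    if dim == 0:
        return [[a[(i + shift) % n_rows][j] for j in range(n_cols)]
                for i in range(n_rows)]
    return [[a[i][(j + shift) % n_cols] for j in range(n_cols)]
            for i in range(n_rows)]


def stencil_shift(src):
    """Four-point average written with whole-array circular shifts."""
    next_row, prev_row = cshift(src, 1, 0), cshift(src, -1, 0)
    next_col, prev_col = cshift(src, 1, 1), cshift(src, -1, 1)
    n_rows, n_cols = len(src), len(src[0])
    return [[0.25 * (next_row[i][j] + prev_row[i][j]
                     + next_col[i][j] + prev_col[i][j])
             for j in range(n_cols)]
            for i in range(n_rows)]


def stencil_sections(src):
    """Same stencil over interior points only, using explicit offsets.

    Boundary elements of the result are left at 0.0, because no
    wrap-around neighbors are defined in this form.
    """
    n_rows, n_cols = len(src), len(src[0])
    dst = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(1, n_rows - 1):
        for j in range(1, n_cols - 1):
            dst[i][j] = 0.25 * (src[i + 1][j] + src[i - 1][j]
                                + src[i][j + 1] + src[i][j - 1])
    return dst


# Hypothetical 5x5 sample grid with integer-valued entries.
n = 5
src = [[float(i * i + 3 * j) for j in range(n)] for i in range(n)]
by_shift = stencil_shift(src)
by_sections = stencil_sections(src)

# The two forms agree on every interior point.
interior_match = all(by_shift[i][j] == by_sections[i][j]
                     for i in range(1, n - 1) for j in range(1, n - 1))
```

In this model the two forms differ only on the boundary, where the circular shift wraps around and the section form leaves the result untouched. Each shifted copy in `stencil_shift` is a full temporary array, which gives a rough picture of the intraprocessor data movement the abstract refers to; on a distributed-memory machine the wrapped and neighboring elements would additionally have to come from other processors.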
doi:10.1145/509593.509605 dblp:conf/sc/RothMKB97 fatcat:f5vd27hdvvhu7cdbvay6y57tym