Position-dependent arrays and their application for high performance code generation

Federico Pizzuti, Michel Steuwer, Christophe Dubach
2019 Proceedings of the 8th ACM SIGPLAN International Workshop on Functional High-Performance and Numerical Computing - FHPNC 2019  
Modern parallel hardware promises unprecedented performance, for the gifted few experts who can program it correctly. Code generators from high-level languages provide an attractive alternative, promising to deliver high performance automatically. Existing projects such as Accelerate, Futhark, Halide, or Lift show that this approach is feasible. Unfortunately, existing efforts focus on computations over tensors: regularly shaped higher-dimensional arrays. This limits the expressiveness of these approaches and excludes many interesting data structures that are commonly encoded manually in memory, such as trees or triangular matrices. This paper presents an extended array type that lifts this restriction. For multidimensional arrays, the size of a nested array might depend on its position in the surrounding arrays, which enables the expression of computations over less regularly shaped data structures. However, these position-dependent arrays bring new challenges for high-performance code generation, as determining the position of the elements in memory becomes more challenging. This paper shows how these challenges are addressed by extending the existing Lift type system and compiler. The experimental results show that this approach enables the efficient code generation of triangular matrix-vector multiplication, with performance improvements over cuBLAS on an Nvidia GPU by up to 2×. Furthermore, we show a use case for a low-level optimization for avoiding unnecessary out-of-bound checks in stencils, leading to up to 3× improvements over already optimized generated stencil codes.
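As an illustration of what the abstract means by a nested array whose size depends on its position, the following minimal sketch stores a lower-triangular matrix in packed form, where row i holds i + 1 elements, and computes a matrix-vector product over it. It is plain Python, not Lift code and not the code the paper's compiler generates; the names `row_offset`, `pack_lower`, and `tri_matvec` are hypothetical and chosen only for this example.

```python
def row_offset(i):
    """Index of the first element of row i in the packed layout.

    Rows 0..i-1 hold 1, 2, ..., i elements, so row i starts at
    1 + 2 + ... + i = i * (i + 1) / 2.
    """
    return i * (i + 1) // 2


def pack_lower(dense):
    """Flatten the lower triangle of a square matrix row by row."""
    return [dense[i][j] for i in range(len(dense)) for j in range(i + 1)]


def tri_matvec(packed, x):
    """Compute y = L x for a lower-triangular L stored in packed form."""
    y = []
    for i in range(len(x)):
        base = row_offset(i)
        # Row i has exactly i + 1 stored elements: its length depends on i.
        y.append(sum(packed[base + j] * x[j] for j in range(i + 1)))
    return y


dense = [[1, 0, 0],
         [2, 3, 0],
         [4, 5, 6]]
packed = pack_lower(dense)
result = tri_matvec(packed, [1, 1, 1])
```

The closed-form `row_offset` is the kind of position computation the abstract says becomes harder once inner array sizes are no longer uniform; with a regular n × n layout the offset would simply be `i * n`.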
doi:10.1145/3331553.3342614 dblp:conf/icfp/PizzutiSD19 fatcat:smkgmq3zmncobcmu5s54aeq2iq