Sparsity: Optimization Framework for Sparse Matrix Kernels

Eun-Jin Im, Katherine Yelick, Richard Vuduc
2004 The international journal of high performance computing applications  
Sparse matrix-vector multiplication is an important computational kernel that performs poorly on most modern processors due to a low compute-to-memory ratio and irregular memory access patterns. Optimization is difficult because of the complexity of cache-based memory systems and because performance is highly dependent on the nonzero structure of the matrix. The Sparsity system is designed to address these problems by allowing users to automatically build sparse matrix kernels that are tuned to
more » ... their matrices and machines. Sparsity combines traditional techniques such as loop transformations with data structure transformations and optimization heuristics that are specific to sparse matrices. It provides a novel framework for selecting optimization parameters, such as block size, using a combination of performance models and search. In this paper we discuss the optimization of two operations: a sparse matrix times a dense vector and a sparse matrix times a set of dense vectors. Our experience indicates that register level optimizations are effective for matrices arising in certain scientific simulations, in particular finite-element problems. Cache level optimizations are important when the vector used in multiplication is larger than the cache size, especially for matrices in which the nonzero structure is random. For applications involving multiple vectors, reorganizing the computation to perform the entire set of multiplications as a single operation produces significant speedups. We describe the different optimizations and parameter selection *
doi:10.1177/1094342004041296 fatcat:hjien6e4hjg5vnrrqyasjhlyde