On Parallel Solution of Sparse Triangular Linear Systems in CUDA [article]

Ruipeng Li
2017 arXiv   pre-print
The acceleration of sparse matrix computations on modern many-core processors, such as the graphics processing units (GPUs), has been recognized and studied over a decade. Significant performance enhancements have been achieved for many sparse matrix computational kernels such as sparse matrix-vector products and sparse matrix-matrix products. Solving linear systems with sparse triangular structured matrices is another important sparse kernel as demanded by a variety of scientific and
more » ... g applications such as sparse linear solvers. However, the development of efficient parallel algorithms in CUDA for solving sparse triangular linear systems remains a challenging task due to the inherently sequential nature of the computation. In this paper, we will revisit this problem by reviewing the existing level-scheduling methods and proposing algorithms with self-scheduling techniques. Numerical results have indicated that the CUDA implementations of the proposed algorithms can outperform the state-of-the-art solvers in cuSPARSE by a factor of up to 2.6 for structured model problems and general sparse matrices.
arXiv:1710.04985v1 fatcat:sydqdpsosjblbc64itdmjlkpde