Parallel preconditioned conjugate gradient algorithm on GPU

Rudi Helfenstein, Jonas Koko
2012 Journal of Computational and Applied Mathematics  
We propose a parallel implementation of the preconditioned conjugate gradient algorithm on a GPU platform. The preconditioning matrix is an approximate inverse derived from the SSOR preconditioner. Because it is applied through sparse matrix-vector multiplication, the proposed preconditioner is well suited to the massively parallel GPU architecture. Compared with a CPU implementation of the conjugate gradient algorithm, our GPU preconditioned conjugate gradient implementation is up to 10 times faster (8 times faster at worst).
doi:10.1016/j.cam.2011.04.025 fatcat:celrnv3bavccfmitrnxg63ji5e
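
The idea in the abstract, an SSOR-derived approximate inverse applied only through sparse matrix-vector products, can be illustrated with a small CPU-side sketch. This is not the authors' implementation. It assumes relaxation parameter ω = 1 and a first-order Neumann truncation, which gives M⁻¹ ≈ Gᵀ D⁻¹ G with G = I − L D⁻¹, where D is the diagonal and L the strict lower triangle of A. The helper names (`poisson_1d`, `spmv`, `make_preconditioner`, `pcg`) and the 1D Poisson test matrix are hypothetical choices made for illustration.

```python
import math

def poisson_1d(n):
    """Tridiagonal SPD test matrix (2 on the diagonal, -1 beside it), stored as row dicts."""
    A = []
    for i in range(n):
        row = {i: 2.0}
        if i > 0:
            row[i - 1] = -1.0
        if i < n - 1:
            row[i + 1] = -1.0
        A.append(row)
    return A

def spmv(A, x):
    """Sparse matrix-vector product; on a GPU each row could be handled by its own thread."""
    return [sum(v * x[j] for j, v in row.items()) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def transpose(A):
    T = [dict() for _ in A]
    for i, row in enumerate(A):
        for j, v in row.items():
            T[j][i] = v
    return T

def make_preconditioner(A):
    """Assumed form: M^{-1} ~= G^T D^{-1} G with G = I - L D^{-1} (omega = 1)."""
    n = len(A)
    d_inv = [1.0 / A[i][i] for i in range(n)]
    G = []
    for i in range(n):
        row = {i: 1.0}
        for j, v in A[i].items():
            if j < i:                      # strict lower triangle only
                row[j] = -v * d_inv[j]     # entry of -L D^{-1}
        G.append(row)
    Gt = transpose(G)

    def apply(r):
        # Two SpMVs and one diagonal scaling; no triangular solve is needed.
        t = spmv(G, r)
        t = [di * ti for di, ti in zip(d_inv, t)]
        return spmv(Gt, t)

    return apply

def pcg(A, b, apply_prec, tol=1e-10, max_iter=1000):
    """Standard preconditioned conjugate gradient; returns (solution, iterations used)."""
    x = [0.0] * len(b)
    r = list(b)                            # residual for the zero initial guess
    z = apply_prec(r)
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b))
    for k in range(1, max_iter + 1):
        Ap = spmv(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, k
        z = apply_prec(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

n = 50
A = poisson_1d(n)
b = [1.0] * n
x, iterations = pcg(A, b, make_preconditioner(A))
residual = max(abs(ai - bi) for ai, bi in zip(spmv(A, x), b))
print("iterations:", iterations, "max residual:", residual)
```

Every operation in the loop is an SpMV, a dot product, or an elementwise vector update. Those are the kinds of kernels that map naturally onto a GPU, whereas applying exact SSOR would require sequential triangular solves.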