Parallel Sparse Linear Algebra for Multi-core and Many-core Platforms : Parallel Solvers and Preconditioners

Dimitar Lukarski
2012
Partial differential equations are typically solved by means of finite difference, finite volume or finite element methods, resulting in large, highly coupled, ill-conditioned and sparse (non-)linear systems. In order to minimize the computing time, we want to exploit the capabilities of modern parallel architectures. The rapid hardware shifts from single-core to multi-core and many-core processors lead to a gap in the progression of algorithms and programming environments for these platforms: the parallel ... models for large clusters do not fully utilize the performance capability of multi-core CPUs and especially of GPUs. The software stack needs to run adequately on the next generation of computing devices in order to exploit the potential of these new systems. Moving numerical software from one platform to another becomes an important task, since every parallel device has its own programming model and language. The greatest challenge is to provide new techniques for solving (non-)linear systems that combine scalability, portability, fine-grained parallelism and flexibility across the assortment of parallel platforms and programming models. The goal of this thesis is to provide new fine-grained parallel algorithms embedded in advanced sparse linear algebra solvers and preconditioners on the emerging multi-core and many-core technologies. With respect to the mathematical methods, we focus on efficient iterative linear solvers. Here, we consider two types of solvers: out-of-the-box solvers such as preconditioned Krylov subspace solvers (e.g. CG, BiCGStab, GMRES), and problem-aware solvers such as geometric matrix-based multi-grid methods. Clearly, the majority of the solvers can be written in terms of sparse matrix-vector and vector-vector operations, which can be performed in parallel. Our aim is to provide parallel, generic and portable preconditioners which are suitable for multi-core and many-core devices. We focus on additive (e.g. Gauss-Seidel, SOR), multiplicative (ILU factorization with or without fill-ins) and approximate inverse preconditioners. The preconditioners can also be used as smoothing schemes in the multi-grid methods via a preconditioned defect correction step. We treat the additive splitting schemes by a multi-coloring technique to provide the necessary level of parallelism. For controlling the fill-in entries for the ILU factorization, we propose a novel method which we call the power(q)-pattern method.
We prove that this algorithm produces ...
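The abstract's remark that most solvers reduce to sparse matrix-vector and vector-vector operations can be illustrated with a minimal sketch of a preconditioned CG iteration. This is not code from the thesis. The function names (`csr_matvec`, `jacobi_pcg`, `laplacian_1d_csr`), the CSR layout as plain Python lists, and the 1D Laplacian test matrix are all assumptions made for illustration. A simple Jacobi (diagonal) preconditioner is swapped in to keep the sketch short; it is not one of the preconditioners the abstract lists.

```python
import math

def csr_matvec(indptr, indices, data, x):
    """Sparse matrix-vector product y = A x in CSR form (rows are independent)."""
    y = [0.0] * (len(indptr) - 1)
    for i in range(len(y)):
        s = 0.0
        for k in range(indptr[i], indptr[i + 1]):
            s += data[k] * x[indices[k]]
        y[i] = s
    return y

def dot(u, v):
    """Vector-vector operation: inner product."""
    return sum(a * b for a, b in zip(u, v))

def axpy(alpha, x, y):
    """Vector-vector operation: returns alpha * x + y."""
    return [alpha * a + b for a, b in zip(x, y)]

def jacobi_pcg(indptr, indices, data, b, tol=1e-10, max_iter=1000):
    """CG with a Jacobi preconditioner (illustrative choice) for an SPD matrix."""
    n = len(b)
    # Preconditioner M^{-1} = diag(A)^{-1}, extracted from the CSR arrays.
    inv_diag = [0.0] * n
    for i in range(n):
        for k in range(indptr[i], indptr[i + 1]):
            if indices[k] == i:
                inv_diag[i] = 1.0 / data[k]
    x = [0.0] * n
    r = list(b)                                   # r = b - A x with x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]    # z = M^{-1} r
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b)) or 1.0
    for it in range(max_iter):
        if math.sqrt(dot(r, r)) <= tol * b_norm:  # relative residual test
            return x, it
        q = csr_matvec(indptr, indices, data, p)
        alpha = rz / dot(p, q)
        x = axpy(alpha, p, x)
        r = axpy(-alpha, q, r)
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = axpy(rz_new / rz, p, z)               # p = z + beta * p
        rz = rz_new
    return x, max_iter

def laplacian_1d_csr(n):
    """Hypothetical test matrix: tridiag(-1, 2, -1) of size n in CSR form."""
    indptr, indices, data = [0], [], []
    for i in range(n):
        for j, v in ((i - 1, -1.0), (i, 2.0), (i + 1, -1.0)):
            if 0 <= j < n:
                indices.append(j)
                data.append(v)
        indptr.append(len(indices))
    return indptr, indices, data

indptr, indices, data = laplacian_1d_csr(4)
b = csr_matvec(indptr, indices, data, [1.0] * 4)  # right-hand side for x = 1
x, iterations = jacobi_pcg(indptr, indices, data, b)
```

Every step inside the loop is either one sparse matrix-vector product, an inner product, or a vector update. The preconditioner application is the only step whose parallelism depends on the method chosen, which is why the abstract concentrates on it.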
doi:10.5445/ir/1000026568 fatcat:k6t5usvrrrghrijqvv5jpnqp3a