Parallel Implementation of the Coordinates-partitioning Based Aggregation-type Algebraic Multigrid Preconditioners

Jian-ping WU, Jun ZHAO, Shu-chang WANG
2018 DEStech Transactions on Computer Science and Engineering  
The coordinates-partitioning based aggregation-type algebraic multigrid preconditioners have proven very efficient for solving sparse linear systems with conjugate gradient iterations. In this paper, a parallel algorithm for the setup is provided, and the parallelization of the preconditioning process is also considered. The parallel algorithm relies on a useful property of the coordinates-partitioning based aggregation, that is, the aggregation process is performed in a ... from the coarsest level to the finest step by step. Thus, the original adjacency graph is partitioned into a number of sub-graphs, where each sub-graph corresponds to a node on the coarsest level and is assigned to a processor. The aggregation process can then proceed on each processor independently, starting from its assigned sub-graph. When this kind of multigrid preconditioner is applied in Krylov subspace iterations, the V- and W-cycle versions require communication only for the computation on the coarsest level and for the matrix-vector multiplications related to the smoothing on each level; the K-cycle version requires only some extra communication for dot products. The size of the derived linear system on the coarsest level can be controlled by adjustable parameters, and this system can in turn be solved in parallel with preconditioned Krylov subspace iterations. The structure information related to the matrix-vector multiplications does not change across iterations, so it can be derived and stored during the setup and reused in later iterations. Finally, the parallel algorithm is validated with the Gauss-Seidel smoother and several popular multigrid cycles, solving sparse linear systems from two-dimensional model partial differential equations with preconditioned conjugate gradient iterations. The results show that the parallel efficiencies of both the setup and the iteration processes are satisfactory.
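The top-down idea in the abstract can be illustrated with a small serial sketch. It is not the authors' algorithm: the median bisection along the wider coordinate axis, the function names `bisect` and `top_down_levels`, and the power-of-two restriction on the number of coarsest groups are all assumptions made for illustration. It only shows how nodes can be grouped by coordinates from the coarsest level to the finest, so that each coarsest group (one per hypothetical processor) is refined without looking at any other group.

```python
def bisect(ids, coords):
    """Split node ids into two halves along the wider coordinate axis.

    Assumption: a median split; the paper's actual partitioning rule may differ.
    """
    xs = [coords[i][0] for i in ids]
    ys = [coords[i][1] for i in ids]
    axis = 0 if (max(xs) - min(xs)) >= (max(ys) - min(ys)) else 1
    ordered = sorted(ids, key=lambda i: (coords[i][axis], i))
    mid = len(ordered) // 2
    return ordered[:mid], ordered[mid:]


def top_down_levels(coords, n_coarse, n_levels):
    """Build nested groups of nodes from the coarsest level to the finest.

    levels[0] holds n_coarse groups (one per hypothetical processor);
    every later level splits each group of the previous level in two.
    """
    n = len(coords)
    if n_coarse < 1 or n_coarse > n or n_coarse & (n_coarse - 1):
        raise ValueError("n_coarse must be a power of two not exceeding the node count")

    # Step 1: partition all nodes into the coarsest groups.
    groups = [list(range(n))]
    while len(groups) < n_coarse:
        groups = [half for g in groups for half in bisect(g, coords)]
    levels = [groups]

    # Step 2: refine each group; a group only ever touches its own nodes,
    # so this loop could run independently on each owner.
    for _ in range(n_levels - 1):
        finer = []
        for g in groups:
            finer.extend(bisect(g, coords) if len(g) > 1 else (g,))
        groups = finer
        levels.append(groups)
    return levels


# Usage: an 8 x 8 grid, 4 coarsest groups, 3 levels in total.
coords = [(x, y) for y in range(8) for x in range(8)]
levels = top_down_levels(coords, n_coarse=4, n_levels=3)
print([len(level) for level in levels])
```

Because every finer group is a subset of exactly one coarsest group, the refinement in step 2 needs no data from other groups, which is the property the abstract uses to run the aggregation on each processor independently.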
doi:10.12783/dtcse/mmsta2017/19671 fatcat:qbiemg4olbapjdnlsotrwei7sy