Assembly Operations for Multicore Architectures Using Task-Based Runtime Systems [chapter]

Damien Genet, Abdou Guermouche, George Bosilca
2014 Lecture Notes in Computer Science  
Traditionally, numerical simulations based on finite element methods consider the algorithm as being divided into three major steps: the generation of a set of blocks and vectors, the assembly of these blocks into a matrix and a large vector, and the inversion of the matrix. In this paper we tackle the second step, the block assembly, for which no parallel algorithm is widely available. Several strategies are proposed to decompose the assembly problem while relying on a scheduling middleware to maximize the overlap between stages and increase the parallelism, and thus the performance. These strategies are quantified using examples covering two extremes in the field: a large number of non-overlapping small blocks for CFD-like problems, and a smaller number of larger blocks with significant overlap, as encountered in sparse linear algebra solvers.
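As a rough illustration of the assembly step the abstract refers to, the sketch below scatter-adds small dense blocks into a global matrix and checks whether two blocks touch common indices. It is not the authors' implementation: the names `assemble` and `overlaps`, the `(indices, values)` block layout, and the dense nested-list storage are all assumptions made for this example.

```python
def assemble(blocks, n):
    """Scatter-add dense blocks into an n x n global matrix (nested lists).

    Hypothetical layout: each block is (indices, values), where `indices`
    lists the global rows/columns the block touches and `values` is the
    dense local matrix of matching size.
    """
    A = [[0.0] * n for _ in range(n)]
    for indices, values in blocks:
        for i_local, i_global in enumerate(indices):
            for j_local, j_global in enumerate(indices):
                # Contributions to the same global entry are summed.
                A[i_global][j_global] += values[i_local][j_local]
    return A


def overlaps(block_a, block_b):
    """Return True if two blocks share at least one global index."""
    return bool(set(block_a[0]) & set(block_b[0]))


# Two 2x2 blocks that overlap on global index 1, and a third disjoint one.
blocks = [
    ([0, 1], [[1.0, 2.0], [3.0, 4.0]]),
    ([1, 2], [[10.0, 20.0], [30.0, 40.0]]),
    ([3, 4], [[5.0, 0.0], [0.0, 5.0]]),
]
A = assemble(blocks, 5)
```

In this toy setting, blocks that do not overlap write to disjoint entries and could be processed independently, whereas overlapping blocks update shared entries and would need some ordering or reduction. These correspond to the two extremes contrasted in the abstract.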
doi:10.1007/978-3-319-14313-2_29 fatcat:a3vl4ycfmvhm7opvvf3tdkrmgu