1,828 Hits in 3.8 sec

Two-Stage Gauss–Seidel Preconditioners and Smoothers for Krylov Solvers on a GPU cluster [article]

Luc Berger-Vergiat, Brian Kelley, Sivasankaran Rajamanickam, Jonathan Hu, Katarzyna Swirydowicz, Paul Mullowney, Stephen Thomas, Ichitaro Yamazaki
2021 arXiv   pre-print
However, the requisite sparse triangular solve is difficult to parallelize on many-core architectures such as graphics processing units (GPUs).  ...  Gauss-Seidel (GS) relaxation is often employed as a preconditioner for a Krylov solver or as a smoother for Algebraic Multigrid (AMG).  ...  To avoid explicitly forming the inverse M^{-1} in (3.1), a sparse triangular solve is applied to the current residual vector r_k.  ... 
arXiv:2104.01196v2 fatcat:myrcryb7qffxjhn5apgu4rgtva

Sparse matrix factorization in the implicit finite element method on petascale architecture

Seid Koric, Anshul Gupta
2016 Computer Methods in Applied Mechanics and Engineering  
The performance of the massively parallel direct multifrontal solver Watson Sparse Matrix Package (WSMP) for solving large sparse systems of linear equations arising in implicit finite element method on  ...  The results show that a direct multifrontal factorization method with a hybrid parallel implementation in WSMP performs exceedingly well on a petascale high-performance computing (HPC) system, and delivers  ...  A general comparison of several sparse linear solvers on a modern multi-core HPC cluster with large sparse matrices originating from practical 3D FEA discretization has recently been performed by  ... 
doi:10.1016/j.cma.2016.01.011 fatcat:7fvypjoinrcmnigv3hldkqfcxe

A study of the existing linear algebra libraries that you can use from C++ (Une étude des bibliothèques d'algèbre linéaire utilisables en C++) [article]

Claire Mouton
2011 arXiv   pre-print
A study of the existing linear algebra libraries that you can use from C++  ...  vectors - Dense matrices: several formats for rectangular, symmetric, Hermitian and triangular - Two sparse matrix forms: Harwell-Boeing and array of sparse vectors - 3D arrays Linear algebra operations  ...  - Triangular, SVD, Cholesky, QR and LU solvers - Eigenvalue/eigenvector solver for non-selfadjoint matrices - Hessenberg decomposition - Tridiagonal decomposition of a selfadjoint matrix Interface with other  ... 
arXiv:1103.3020v1 fatcat:327td5jy7jacdooqqidj24lfzq

Fast and Scalable Sparse Triangular Solver for Multi-GPU Based HPC Architectures [article]

Chenhao Xie, Jieyang Chen, Jesun S Firoz, Jiajia Li, Shuaiwen Leon Song, Kevin Barker, Mark Raugas, Ang Li
2020 arXiv   pre-print
This is particularly the case for Sparse Triangular Solver (SpTRSV) which introduces additional two-dimensional computation dependencies among subsequent computation steps.  ...  Designing efficient and scalable sparse linear algebra kernels on modern multi-GPU based HPC systems is a daunting task due to significant irregular memory references and workload imbalance across the  ...  [10] propose a 3D sparse structure to replicate the dependent data for avoiding expensive communication.  ... 
arXiv:2012.06959v1 fatcat:am7guw7i5fchxafrkp34plwvky

Accelerating advanced preconditioning methods on hybrid architectures

Ernesto Dufrechou
2021 CLEI Electronic Journal  
In particular, we study ILUPACK, a package for the solution of sparse linear systems via Krylov subspace methods that relies on a modern inverse-based multilevel ILU (incomplete LU) preconditioning technique  ...  We present new data-parallel versions of the preconditioner and the most important solvers contained in the package that significantly improve its performance without affecting its accuracy.  ...  [34] proposed a new GPU solver for sparse triangular systems, for matrices stored in the CSC format, based on the self-scheduled strategy.  ... 
doi:10.19153/cleiej.24.1.6 doaj:cf900516b6334e27afbe4102fa203079 fatcat:ohhmcgyrl5hgfaib7hb2rdyhom

QRkit: Sparse, Composable QR Decompositions for Efficient and Stable Solutions to Problems in Computer Vision [article]

Jan Svoboda, Thomas Cashman, Andrew Fitzgibbon
2018 arXiv   pre-print
This accuracy can be regained using solvers based on QR rather than Cholesky decomposition, but the absence of sparse QR solvers for common sparsity patterns found in computer vision means that many applications  ...  We introduce an open-source suite of solvers for Eigen, which efficiently compute the QR decomposition for matrices with some common sparsity patterns (block diagonal, horizontal and vertical concatenation  ...  QRkit: Our new kit of sparse QR factorizations directly implemented as a submodule of the Eigen C++ library. Eigen Sparse QR: Current implementation of Sparse QR solver in the Eigen C++ library.  ... 
arXiv:1802.03773v1 fatcat:ffmisedkiffr3o7udxd2zdofle

Communication in task-parallel ILU-preconditioned CG solvers using MPI + OmpSs

José I. Aliaga, María Barreda, Goran Flegar, Matthias Bollhöfer, Enrique S. Quintana-Ortí
2017 Concurrency and Computation  
In addition, we integrate several communication-avoiding (CA) strategies into the codes, including the butterfly communication scheme and Eijkhout's formulation of the CG method.  ...  Moreover, it can be computed  ...  Here, we note that z 0 = [z 0 |z 2 ] are stored in P0, while P1 contains a copy of z 1 = [z 1 |z 2 ].  ...  Communication-avoiding techniques for ILU-PCG solvers. Butterfly communication pattern: The butterfly transformation is an efficient communication scheme for collective operations [9, 10].  ... 
doi:10.1002/cpe.4280 fatcat:m7hsdjutjnhfhmxxl4q5izo6ka

Domain Overlap for Iterative Sparse Triangular Solves on GPUs [chapter]

Hartwig Anzt, Edmond Chow, Daniel B. Szyld, Jack Dongarra
2016 Lecture Notes in Computational Science and Engineering  
., more subdomains and threads than cores, there is a preference in processing or scheduling the subdomains in a specific order, following the dependencies specified by the sparse triangular matrix.  ...  Iterative methods for solving sparse triangular systems are an attractive alternative to exact forward and backward substitution if an approximation of the solution is acceptable.  ...  Here, the triangular matrix is written as a product of sparse triangular factors, and each triangular solve becomes a sequence of sparse matrix vector multiplications.  ... 
doi:10.1007/978-3-319-40528-5_24 fatcat:pxzjssbq4bhcjdxehti3pagkqi

miniSAM: A Flexible Factor Graph Non-linear Least Squares Optimization Framework [article]

Jing Dong, Zhaoyang Lv
2019 arXiv   pre-print
list of sparse linear solvers, including CUDA enabled sparse linear solvers.  ...  Compared to most existing frameworks for least squares solvers, miniSAM has (1) full Python/NumPy API, which enables more agile development and easy binding with existing Python projects, and (2) a wide  ...  use third-party sparse linear solvers.  ... 
arXiv:1909.00903v1 fatcat:x3nhzopuivhmzgvu7e6g3evlb4

Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and its Application to Unstructured Matrices

Jongsoo Park, Mikhail Smelyanskiy, Karthikeyan Vaidyanathan, Alexander Heinecke, Dhiraj D. Kalamkar, Xing Liu, Md. Mostofa Ali Patwary, Yutong Lu, Pradeep Dubey
2014 SC14: International Conference for High Performance Computing, Networking, Storage and Analysis  
A new sparse high performance conjugate gradient benchmark (HPCG) has been recently released to address challenges in the design of sparse linear solvers for the next generation extreme-scale computing  ...  In addition, we demonstrate that our optimizations not only benefit the original HPCG dataset, which is based on a structured 3D grid, but also a wide range of unstructured matrices.  ...  We also thank Ludovic Sauge at BULL, Shane Story, and Vadim Pirogov for help communicating with various institutions.  ... 
doi:10.1109/sc.2014.82 dblp:conf/sc/ParkSVHKLPLD14 fatcat:ktiisywie5hhznon5qc2tuydoa

PARDISO: a high-performance serial and parallel sparse linear solver in semiconductor device simulation

Olaf Schenk, Klaus Gärtner, Wolfgang Fichtner, Andreas Stricker
2001 Future Generation Computer Systems  
The package PARDISO is a high-performance, robust and easy to use software for solving large sparse symmetric or structurally symmetric linear systems of equations on shared memory multiprocessors.  ...  It delivers up to 960 Mflop/s on COMPAQ Alpha ES40 (667 MHz) for irregular problems and sparse matrix factorization has been clocked up at a speedup of 7 on an 8-node SGI Origin 2000.  ...  The special device study was performed in a joint project with Robert Bosch GmbH, Germany.  ... 
doi:10.1016/s0167-739x(00)00076-5 fatcat:zbn2pwehtfaopemagwqma3gtzu

Parallel finite element technique using Gaussian belief propagation

Yousef El-Kurdi, Maryam Mehri Dehnavi, Warren J. Gross, Dennis Giannacopoulos
2015 Computer Physics Communications  
In addition, such a solver would still require assembling a large sparse data-structure.  ...  communication-bound; recent attempts to improve the communication overhead of such solvers through reformulation, namely communication  ...  FGaBP avoids assembling a global sparse matrix, thus, is a promising candidate for manycore architectures.  ... 
doi:10.1016/j.cpc.2015.03.019 fatcat:fbfz5ce66fcbjc26sh7ukn67kq

Kokkos Kernels: Performance Portable Sparse/Dense Linear Algebra and Graph Kernels [article]

Sivasankaran Rajamanickam, Seher Acer, Luc Berger-Vergiat, Vinh Dang, Nathan Ellingwood, Evan Harvey, Brian Kelley, Christian R. Trott, Jeremiah Wilke, Ichitaro Yamazaki
2021 arXiv   pre-print
We describe Kokkos Kernels, a library of kernels for sparse linear algebra, dense linear algebra and graph kernels.  ...  Specifically, we demonstrate the performance of four sparse kernels, three dense batched kernels, two graph kernels and one team level algorithm.  ...  (SpGEMM) and sparse triangular solver (SpTRSV).  ... 
arXiv:2103.11991v1 fatcat:m7iskgt5kjdjnex7lenqsjj6z4

Using Random Butterfly Transformations in Parallel Schur Complement-Based Preconditioning

Marc Baboulin, Aygul Jamal, Masha Sosonkina
2015 Proceedings of the 2015 Federated Conference on Computer Science and Information Systems  
of sparse linear systems.  ...  We propose to use a randomization technique based on Random Butterfly Transformations (RBT) in the Algebraic Recursive Multilevel Solver (ARMS) to improve the preconditioning phase in the iterative solution  ...  RBT is a random transformation of A which can avoid pivoting and then can reduce the amount of communication.  ... 
doi:10.15439/2015f177 dblp:conf/fedcsis/BaboulinJS15 fatcat:7udjszd5yrekjotj3mm3gcju4y

Evaluation of parallel direct sparse linear solvers in electromagnetic geophysical problems

Vladimir Puzyrev, Seid Koric, Scott Wilkin
2016 Computers & Geosciences  
A major computational bottleneck of modeling and inversion algorithms is solving the large, sparse, ill-conditioned systems of linear equations in complex domains with multiple right-hand sides.  ...  Wide use of direct methods utilizing modern parallel architectures will allow modeling tools to accurately support multi-source surveys and 3D data acquisition geometries, thus promoting a more efficient  ...  While direct methods avoid the convergence problems of iterative solvers, they are much more expensive in terms of memory due to fill-in in the triangular factors.  ... 
doi:10.1016/j.cageo.2016.01.009 fatcat:oigknlzm6ndzlizmbngtoewbry
Showing results 1-15 out of 1,828 results