An out-of-core sparse Cholesky solver

John K. Reid, Jennifer A. Scott
2009 ACM Transactions on Mathematical Software  
Direct methods for solving large sparse linear systems of equations are popular because of their generality and robustness. Their main weakness is that the memory they require usually increases rapidly with problem size. We discuss the design and development of the first release of a new symmetric direct solver that aims to circumvent this limitation by allowing the system matrix, intermediate data, and the matrix factors to be stored externally. The code, which is written in Fortran and called
HSL_MA77, implements a multifrontal algorithm. The first release is for positive-definite systems and performs a Cholesky factorization. Special attention is paid to the use of efficient dense linear algebra kernel codes that handle the full-matrix operations on the frontal matrix and to the input/output operations. The input/output operations are performed using a separate package that provides a virtual-memory system and allows the data to be spread over many files; for very large problems, these may be held on more than one device. Numerical results are presented for a collection of 30 large real-world problems, all of which were solved successfully.

As the demand of computational scientists for more accurate models increases, so inevitably do the sizes of the systems that must be solved and thus the memory needed by direct solvers. The amount of main memory available on computers has increased enormously in recent years, and this has allowed direct solvers to be used to solve many more problems than was previously possible using only main memory. However, the memory required by direct solvers generally increases much more rapidly than the problem size, so that they can quickly run out of memory, particularly when the linear systems arise from discretizations of three-dimensional (3D) problems. One solution has been to use parallel computing, for example, by using the MUMPS package [MUMPS 2007]. For many users, the option of using such a computer is either not available or is too expensive. An obvious alternative is to use an iterative method in place of a direct one. A carefully chosen and tuned preconditioned iterative method will often run significantly faster than a direct solver and will require far less memory. However, for many of the "tough" systems that arise from practical applications, the difficulties involved in finding and computing a good preconditioner can make iterative methods infeasible. An alternative is to use a direct solver that is able to hold its data structures on disk, that is, an out-of-core solver.

The idea of out-of-core linear solvers is not new. Indeed, the first author of this article wrote an out-of-core multifrontal solver for finite-element systems more than 20 years ago [Reid 1984], and the HSL mathematical software library [HSL 2007] has included out-of-core frontal solvers since about that time. The HSL package MA42 [Duff and Scott 1996] is particularly widely used, both by academics and as the linear solver within a number of commercial packages. The Boeing library [BCSLIB-EXT 2003] also includes multifrontal solvers with out-of-core facilities. More recently, a number of researchers [Dobrian and Pothen 2003; Rothberg and Schreiber 1999; Rotkin and Toledo 2004] have proposed out-of-core sparse symmetric solvers.

In this article, we discuss the design and development of the first release of a new HSL sparse symmetric out-of-core solver. The system matrix A, intermediate data, and the factors may be stored externally. The code, which is written in Fortran and called HSL_MA77, implements a multifrontal algorithm. The first release is for positive-definite systems and performs a Cholesky factorization. The second release has an option that incorporates numerical pivoting using 1 × 1 and 2 × 2 pivots, which extends the package to indefinite problems.
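To make the frontal-matrix operations concrete, here is a minimal sketch of the kind of dense kernel that a multifrontal Cholesky factorization applies to each frontal matrix: the fully summed variables are eliminated and the trailing block becomes the generated element (Schur complement) that is assembled into the parent front. The program name, routine name, interface, and data below are illustrative assumptions only; this is not the HSL_MA77 kernel, which uses the efficient dense linear algebra codes referred to above.

! Minimal sketch (illustrative only): partial Cholesky factorization of one
! dense frontal matrix. The leading nelim variables are fully summed and are
! eliminated; the trailing block is overwritten by the generated element
! (Schur complement) that would be assembled into the parent front.
program frontal_demo
  implicit none
  integer, parameter :: wp = kind(1.0d0)
  real(wp) :: f(3, 3)
  integer  :: info
  ! A tiny positive-definite frontal matrix of order 3 in which the first
  ! two variables are fully summed (data chosen purely for illustration).
  f = reshape([4.0_wp, 2.0_wp, 2.0_wp, &
               2.0_wp, 5.0_wp, 1.0_wp, &
               2.0_wp, 1.0_wp, 6.0_wp], [3, 3])
  call partial_cholesky(3, 2, f, info)
  print *, 'info =', info                 ! 0 on success
  print *, 'generated element:', f(3, 3)  ! 1-by-1 Schur complement
contains
  subroutine partial_cholesky(n, nelim, f, info)
    integer, intent(in)     :: n        ! order of the frontal matrix
    integer, intent(in)     :: nelim    ! number of fully summed variables
    real(wp), intent(inout) :: f(n, n)  ! lower triangle holds the front
    integer, intent(out)    :: info     ! 0, or index of a non-positive pivot
    integer :: j, k
    info = 0
    do k = 1, nelim
       if (f(k, k) <= 0.0_wp) then
          info = k
          return
       end if
       f(k, k) = sqrt(f(k, k))
       f(k+1:n, k) = f(k+1:n, k) / f(k, k)
       do j = k + 1, n                   ! update the remaining columns
          f(j:n, j) = f(j:n, j) - f(j:n, k) * f(j, k)
       end do
    end do
    ! Columns 1..nelim now hold columns of the Cholesky factor L;
    ! f(nelim+1:n, nelim+1:n) holds the generated element.
  end subroutine partial_cholesky
end program frontal_demo

Run on this 3 × 3 example, the sketch eliminates the two fully summed variables and leaves a 1 × 1 generated element (here, 5.0) that a multifrontal code would hold on its stack and later assemble into the parent frontal matrix.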
An alternative to the multifrontal algorithm is a left-looking strategy, in which the updates to a column are delayed until that column is about to be eliminated. During the factorization, less data needs to be stored, but it has to be read many times. Our decision to use the multifrontal algorithm is based on our extensive experience with this method and on not having seen evidence that it is consistently inferior.

This article describes the design of HSL_MA77, explaining many of the design decisions and highlighting key features of the package. Section 2 provides a brief overview of the multifrontal method. In Section 3, we describe the
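For contrast with the multifrontal kernel sketched above, the left-looking strategy mentioned earlier can be illustrated in the dense case as follows: all updates to a column from previously eliminated columns are applied immediately before that column is itself eliminated. The routine below is a minimal, purely illustrative sketch and is not HSL_MA77 code.

! Minimal sketch (illustrative only): dense left-looking Cholesky.
! Updates to column j from columns 1..j-1 are delayed until column j is
! about to be eliminated, so those earlier columns must be read again here.
subroutine left_looking_cholesky(n, a, info)
  implicit none
  integer, intent(in)  :: n
  real(kind(1.0d0)), intent(inout) :: a(n, n)  ! lower triangle: A on entry, L on exit
  integer, intent(out) :: info                 ! 0, or index of a non-positive pivot
  integer :: j, k
  info = 0
  do j = 1, n
     do k = 1, j - 1                           ! delayed updates from earlier columns
        a(j:n, j) = a(j:n, j) - a(j:n, k) * a(j, k)
     end do
     if (a(j, j) <= 0.0d0) then
        info = j
        return
     end if
     a(j, j) = sqrt(a(j, j))
     a(j+1:n, j) = a(j+1:n, j) / a(j, j)
  end do
end subroutine left_looking_cholesky

In an out-of-core setting, the delayed updates are the reason the earlier factor columns, already written to disk, must be re-read as each new column is eliminated; this is the sense in which a left-looking factorization stores less intermediate data but reads it many times.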
doi:10.1145/1499096.1499098