Stable and Efficient Spectral Divide and Conquer Algorithms for the Symmetric Eigenvalue Decomposition and the SVD

Yuji Nakatsukasa, Nicholas J. Higham
2013 SIAM Journal on Scientific Computing  
Spectral divide and conquer algorithms solve the eigenvalue problem for all the eigenvalues and eigenvectors by recursively computing an invariant subspace for a subset of the spectrum and using it to decouple the problem into two smaller subproblems. A number of such algorithms have been developed over the last 40 years, often motivated by parallel computing and, most recently, with the aim of achieving minimal communication costs. However, none of the existing algorithms has been proved to be
more » ... s been proved to be backward stable, and they all have a significantly higher arithmetic cost than the standard algorithms currently used. We present new spectral divide and conquer algorithms for the symmetric eigenvalue problem and the singular value decomposition that are backward stable, achieve lower bounds on communication costs recently derived by Ballard, Demmel, Holtz, and Schwartz, and have operation counts within a small constant factor of those for the standard algorithms. The new algorithms are built on the polar decomposition and exploit the recently developed QR-based dynamically weighted Halley algorithm of Nakatsukasa, Bai, and Gygi, which computes the polar decomposition using a cubically convergent iteration based on the building blocks of QR factorization and matrix multiplication. The algorithms have great potential for efficient, numerically stable computations in situations where the cost of communication dominates the cost of arithmetic. ] build on the work of Bulgakov, Godunov, and Malyshev to develop an inverse-free spectral divide and conquer algorithm based solely on rank-revealing QR factorization and matrix multiplication, applying to the generalized eigenvalue problem. Huss-Lederman et al. [35] develop a hybrid Newton and inverse-free iteration approach for the generalized eigenvalue problem, targeting parallel computers. All the above contributions are aimed at general, nonsymmetric matrices. Spectral divide and conquer algorithms for symmetric matrices have been developed in the PRISM project [14] , [36] , and by Zhang, Zha, and Ying [50], [51] . Very recently, Demmel, Dumitriu, and Holtz [18] have shown how to exploit randomization to allow standard QR factorization to be used throughout the algorithm of Bai, Demmel, and Gu [9], while Ballard, Demmel, and Dumitriu [10] have developed this approach into algorithms that achieve, asymptotically, lower bounds on the costs of communication. The spectral divide and conquer algorithm of Bai and Demmel [7], which is based on the matrix sign function computed by the (scaled) Newton iteration, can also be implemented so as to attain communication lower bounds. With the exception of [51], the above papers develop and test algorithms for a single splitting step rather than for the recursive solution process that yields the eigensystem. Unfortunately, except for the scaled Newton approach, these spectral divide and conquer algorithms generally require significantly more arithmetic than the standard algorithms based on reduction to condensed (tridiagonal or bidiagonal) form, and the scaled Newton approach has poorer stability than standard algorithms in our experiments. In addition, none of them has been proven to be backward stable in floating point arithmetic. For example, the authors in [10], [18] use probabilistic arguments combined with randomization to argue that the backward error is acceptably small with high probability, and suggest rerunning the process with a randomized starting matrix when the backward error is large. The analyses in [50], [51] assume exact arithmetic. In this work we develop new spectral divide and conquer algorithms for the symmetric eigendecomposition and the singular value decomposition (SVD) that have a number of novel features. They • are based on the polar decomposition, so they can exploit fast, backward stable algorithms for computing the polar decomposition; • require just the building blocks of matrix multiplication and QR factorization (without pivoting); • do not suffer from a high floating point operation count, slow convergence caused by eigenvalues near a splitting point, or numerical instability-one or more of which have affected all previous spectral divide and conquer algorithms; • are backward stable under two mild conditions, both of which are easy to verify and almost always satisfied in practice; • asymptotically minimize communication while at the same time have arithmetic costs within a small factor of those for the standard methods used in LAPACK (≈ 3 for the eigenproblem and ≈ 2 for the SVD). The main tool underlying our algorithms is the QDWH (QR-based dynamically weighted Halley) algorithm of Nakatsukasa, Bai, and Gygi [45], which is a QR factorization-based algorithm for computing the polar decomposition. For the symmetric eigenproblem, the key fact is that the positive and negative invariant subspaces of a symmetric matrix can be efficiently computed via the orthogonal polar factor. This observation leads to our spectral divide and conquer algorithm QDWH-eig. For A1327 an n × n matrix A, the dominant cost of QDWH-eig is in performing six or fewer QR factorizations of 2n × n matrices. QDWH-eig is much more efficient than the other spectral divide and conquer algorithms proposed in the literature and is the first to have proven backward stability. Our SVD algorithm, QDWH-SVD, computes the polar decomposition A = U p H ∈ C m×n using the QDWH algorithm and then uses QDWH-eig to compute the eigendecomposition H = V ΣV * . The SVD of A is then A = (U p V )ΣV * = U ΣV * . The essential cost for computing the SVD is in performing QR factorizations of no more than six (m + n) × n matrices and six 2n × n matrices. We prove the backward stability of QDWH-eig and QDWH-SVD under two conditions: 1. The polar decompositions computed by QDWH are backward stable. 2. The column space of a computed orthogonal projection matrix is obtained in a backward stable manner. In [46] we prove that the first condition is satisfied when the QR factorizations are computed with column pivoting and row pivoting or sorting. In practice, even without pivoting QDWH almost always computes the polar decomposition in a backward stable manner, as demonstrated by the experiments in [45] , [46] . The second condition holds in exact arithmetic for a subspace obtained by a single step of subspace iteration, and in practice it holds most of the time. Hence both conditions are usually satisfied in practice, and can be inexpensively verified if necessary. Note that backward stability of our algorithms implies that the eigen(singular) values are computed accurately in the absolute (but not necessarily relative) sense, as the eigenvalues of a symmetric matrix and singular values of any matrix are all well conditioned. As proof of concept, we perform numerical experiments with our algorithms on a shared-memory machine with four processors, employing (via MATLAB function qr) the conventional LAPACK QR factorization algorithm that does not minimize communication. Even under such conditions, our algorithms are not too much slower than the conventional algorithms, and are sometimes faster than the traditional eigenvalue and SVD algorithms based on the QR algorithm. On massively parallel computing architectures we expect that the communication-optimality of our algorithms will improve the performance significantly. Furthermore, while both our algorithms and most conventional algorithms are proven to be backward stable and to yield numerically orthogonal computed eigen(singular)-vectors, experiments demonstrate that our algorithms usually give considerably smaller backward errors and vectors closer to orthogonality. Moreover, our algorithms tend to compute the eigenvalues and singular values more accurately, and determine the rank more reliably when applied to rank-deficient matrices. The rest of the paper is organized as follows. In section 2 we summarize the QR factorization-based polar decomposition algorithm QDWH [45] . In section 3 we develop QDWH-eig, establish its backward stability, and compare it with other algorithms for symmetric eigenproblems. Section 4 describes our SVD algorithm, QDWH-SVD. Section 5 addresses practical implementation issues and techniques to enhance the performance of the algorithms. Numerical experiments are described in section 6 and conclusions are presented in section 7. We summarize our notation. The singular values of an m × n rectangular matrix A with m ≥ n are σ 1 (A) ≥ · · · ≥ σ n (A) and we write σ max (A) = σ 1 (A) and σ min (A) = σ n (A). A 2 = σ max (A) denotes the spectral norm and A F = ( i,j |a ij | 2 ) 1/2 is the Frobenius norm. For a Hermitian matrix A, λ i (A) denotes the ith largest eigenvalue and eig(A) is the set of all eigenvalues. κ 2 (A) = σ max (A)/σ min (A) is the condition Downloaded 08/09/13 to Redistribution subject to SIAM license or copyright; see
doi:10.1137/120876605 fatcat:itz4aejkfvezneeaglgjfxirrq