Distributed-Memory Hierarchical Compression of Dense SPD Matrices

Chenhan D. Yu, Severin Reiz, George Biros
2018 SC18: International Conference for High Performance Computing, Networking, Storage and Analysis  
We present a distributed-memory algorithm for the hierarchical compression of symmetric positive definite (SPD) matrices. Our method is based on GOFMM, an algorithm that appeared in doi:10.1145/3126908.3126921. For many SPD matrices, GOFMM enables compression and approximate matrix-vector multiplication in O(N log N) time, as opposed to the O(N²) required for a dense matrix. But GOFMM supports only shared-memory parallelism. In this paper, we use the Message Passing Interface (MPI) and extend the ideas of GOFMM to the distributed-memory setting. We also propose and implement an asynchronous algorithm for faster multiplication. We present different usage scenarios on a selection of SPD matrices related to graphs, neural networks, and covariance operators. We present results on the Texas Advanced Computing Center's "Stampede 2" system. We also compare with the STRUMPACK software package, which, to our knowledge, is the only other available software that can compress arbitrary SPD matrices in parallel. In our largest run, we compressed a 67M-by-67M matrix in less than three minutes and performed a multiplication with 512 vectors within 5 seconds on 6,144 Intel "Skylake" cores.
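To make the complexity claim concrete, here is a minimal, self-contained sketch of the underlying idea: off-diagonal blocks of a well-behaved SPD kernel matrix are replaced by low-rank factors, so an approximate matrix-vector product touches far fewer entries than the dense O(N²) product. This is not the GOFMM algorithm itself (which is multi-level, tree-based, and geometry-oblivious); all function names and parameters below are illustrative assumptions.

import numpy as np

def compress_one_level(K, rank):
    # Split K into 2x2 blocks; keep the diagonal blocks dense and replace the
    # off-diagonal block K12 with a rank-`rank` truncated SVD, K12 ~ U @ Vt.
    n = K.shape[0] // 2
    K11, K22 = K[:n, :n], K[n:, n:]
    U, s, Vt = np.linalg.svd(K[:n, n:], full_matrices=False)
    U, Vt = U[:, :rank] * s[:rank], Vt[:rank, :]
    return K11, K22, U, Vt

def matvec(K11, K22, U, Vt, x):
    # Approximate K @ x from the compressed form; by symmetry K21 ~ Vt.T @ U.T.
    n = K11.shape[0]
    x1, x2 = x[:n], x[n:]
    y1 = K11 @ x1 + U @ (Vt @ x2)
    y2 = K22 @ x2 + Vt.T @ (U.T @ x1)
    return np.concatenate([y1, y2])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = np.sort(rng.uniform(size=1024))
    # Gaussian kernel on 1-D points: SPD and with rapidly decaying off-diagonal rank.
    K = np.exp(-(pts[:, None] - pts[None, :]) ** 2 / 0.1)
    x = rng.standard_normal(1024)
    approx = matvec(*compress_one_level(K, rank=16), x)
    print("relative error:", np.linalg.norm(K @ x - approx) / np.linalg.norm(K @ x))

Applying the same split recursively to the diagonal blocks, as hierarchical schemes do, is what drives the cost toward O(N log N); the one-level sketch above only illustrates the low-rank building block.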
doi:10.1109/sc.2018.00018