Fast Matrix Square Roots with Applications to Gaussian Processes and Bayesian Optimization [article]

Geoff Pleiss, Martin Jankowiak, David Eriksson, Anil Damle, Jacob R. Gardner
2020, arXiv pre-print
Matrix square roots and their inverses arise frequently in machine learning, e.g., when sampling from high-dimensional Gaussians 𝒩(0, 𝐊) or whitening a vector 𝐛 against covariance matrix 𝐊. While existing methods typically require O(N^3) computation, we introduce a highly efficient quadratic-time algorithm for computing 𝐊^1/2𝐛, 𝐊^-1/2𝐛, and their derivatives through matrix-vector multiplications (MVMs). Our method combines Krylov subspace methods with a rational approximation and typically achieves 4 decimal places of accuracy with fewer than 100 MVMs. Moreover, the backward pass requires little additional computation. We demonstrate our method's applicability on matrices as large as 50,000 × 50,000 - well beyond traditional methods - with little approximation error. Applying this increased scalability to variational Gaussian processes, Bayesian optimization, and Gibbs sampling results in more powerful models with higher accuracy.
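To illustrate the rational-approximation idea, the sketch below approximates 𝐊^-1/2𝐛 using only MVMs with 𝐊, via the identity A^{-1/2} = (2/π) ∫_0^∞ (t²I + A)^{-1} dt discretized with a simple midpoint quadrature, solving each shifted system with off-the-shelf conjugate gradients. This is a simplification, not the authors' algorithm: the paper pairs a faster quadrature with a single multi-shift Krylov solve that shares one Krylov space across all shifts. All function names here are illustrative.

```python
# Hedged sketch: approximate K^{-1/2} b with only MVMs, using
#   A^{-1/2} = (2/pi) * integral_0^inf (t^2 I + A)^{-1} dt,
# discretized by a midpoint rule after substituting t = tan(theta).
# Plain CG per shift is used here for clarity only.
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def inv_sqrt_mvm(K_mvm, b, num_quad=20):
    """Approximate K^{-1/2} b for SPD K, given only a matvec K_mvm."""
    n = b.shape[0]
    h = (np.pi / 2) / num_quad                 # midpoint rule on [0, pi/2]
    thetas = (np.arange(num_quad) + 0.5) * h
    result = np.zeros(n)
    for theta in thetas:
        shift = np.tan(theta) ** 2
        A = LinearOperator((n, n), matvec=lambda v, s=shift: K_mvm(v) + s * v)
        x, _ = cg(A, b)                        # solve (K + t^2 I) x = b
        result += h / np.cos(theta) ** 2 * x   # sec^2(theta) Jacobian factor
    return (2.0 / np.pi) * result

# Usage: draw a sample from N(0, K) as K^{1/2} eps = K (K^{-1/2} eps).
rng = np.random.default_rng(0)
n = 500
L = rng.standard_normal((n, n))
K = L @ L.T + n * np.eye(n)                    # a well-conditioned SPD matrix
eps = rng.standard_normal(n)
sample = K @ inv_sqrt_mvm(lambda v: K @ v, eps)
```

Note the 𝐊^1/2𝐛 product is obtained as 𝐊(𝐊^-1/2𝐛), so the whole pipeline touches 𝐊 only through matrix-vector products, which is what enables the quadratic runtime.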
arXiv:2006.11267v2