Estimation of Shortest Path Covariance Matrices [article]

Raj Kumar Maity, Cameron Musco
2020 arXiv pre-print
We study the sample complexity of estimating the covariance matrix Σ ∈ ℝ^{d×d} of a distribution 𝒟 over ℝ^d given independent samples, under the assumption that Σ is graph-structured. In particular, we focus on shortest path covariance matrices, where the covariance between any two measurements is determined by the shortest path distance in an underlying graph with d nodes. Such matrices generalize Toeplitz and circulant covariance matrices and are widely applied in signal processing, where the covariance between two measurements depends on the (shortest path) distance between them in time or space. We focus on minimizing both the vector sample complexity (the number of samples drawn from 𝒟) and the entry sample complexity (the number of entries read in each sample). The entry sample complexity corresponds to measurement equipment costs in signal processing applications. We give a very simple algorithm for estimating Σ up to spectral norm error ϵ‖Σ‖_2 using just O(√D) entry sample complexity and Õ(r^2/ϵ^2) vector sample complexity, where D is the diameter of the underlying graph and r ≤ d is the rank of Σ. Our method is based on extending the widely applied idea of sparse rulers for Toeplitz covariance estimation to the graph setting. In the special case when Σ is a low-rank Toeplitz matrix, our result matches the state-of-the-art, with a far simpler proof. We also give an information theoretic lower bound matching our upper bound up to a factor D and discuss some directions towards closing this gap.
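To make the sparse ruler idea concrete, here is a minimal Python sketch of the classical Toeplitz special case only; it is not the graph algorithm of the paper. It assumes zero-mean samples, and the function names `sparse_ruler` and `estimate_toeplitz_cov` as well as the particular ruler construction are illustrative choices, not taken from the paper. A sparse ruler is a set of roughly 2√d positions whose pairwise differences cover every lag 0, ..., d-1, so each lag's covariance can be estimated while reading only those positions in each sample.

```python
import math
import numpy as np


def sparse_ruler(d):
    """Return sorted marks in {0, ..., d-1} whose pairwise differences
    cover every lag 0, ..., d-1 (one simple construction, about 2*sqrt(d) marks)."""
    k = math.isqrt(d - 1) + 1              # ceil(sqrt(d)) for d >= 1
    marks = set(range(min(k, d)))          # dense block 0, ..., k-1
    marks.update(range(d - 1, -1, -k))     # coarse marks d-1, d-1-k, d-1-2k, ...
    return sorted(marks)


def estimate_toeplitz_cov(X, ruler):
    """Estimate a d x d Toeplitz covariance from samples X (shape n x d),
    reading only the columns listed in `ruler`. Assumes zero-mean samples."""
    n, d = X.shape
    Xr = X[:, ruler]                       # the only entries that are read
    # For each lag s, remember one pair of ruler marks at distance s.
    pair_for_lag = {}
    for i, a in enumerate(ruler):
        for j, b in enumerate(ruler):
            if a >= b:
                pair_for_lag.setdefault(a - b, (i, j))
    # Empirical covariance at each lag, averaged over the n samples.
    t = np.array([
        np.mean(Xr[:, pair_for_lag[s][0]] * Xr[:, pair_for_lag[s][1]])
        for s in range(d)
    ])
    # Fill entry (i, j) with the estimate for lag |i - j|.
    lags = np.abs(np.arange(d)[:, None] - np.arange(d)[None, :])
    return t[lags]
```

In this sketch each sample contributes only `len(ruler)` entries, which is the entry sample complexity in the Toeplitz case. In the shortest path setting the abstract describes, lags would be replaced by shortest path distances in the graph, which is why the stated bound depends on the diameter D rather than on d.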
arXiv:2011.09986v1 fatcat:2mortzq7hraerhe36jkgdkfade