Information-theoretic limits on sparse support recovery: Dense versus sparse measurements
2008 IEEE International Symposium on Information Theory
We study the information-theoretic limits of exactly recovering the support of a sparse signal using noisy projections defined by various classes of measurement matrices. Our analysis is high-dimensional in nature: the number of observations n, the ambient signal dimension p, and the signal sparsity k are all allowed to tend to infinity in a general manner. This paper makes two novel contributions. First, we provide sharper necessary conditions for exact support recovery using general (including non-Gaussian) dense measurement matrices. Combined with previously known sufficient conditions, this result yields sharp characterizations of when the optimal decoder can recover a signal for various scalings of the sparsity k and sample size n, including the important special case of linear sparsity (k = Θ(p)) using a linear scaling of observations (n = Θ(p)). Our second contribution is to prove necessary conditions on the number of observations n required for asymptotically reliable recovery using a class of γ-sparsified measurement matrices, where the measurement sparsity γ(n, p, k) ∈ (0, 1] corresponds to the fraction of non-zero entries per row. Our analysis allows general scaling of the quadruplet (n, p, k, γ), and reveals three different regimes, corresponding to whether measurement sparsity has no effect, a minor effect, or a dramatic effect on the information-theoretic limits of the subset recovery problem.

Our analysis has two purposes: first, to demonstrate where known polynomial-time methods achieve the information-theoretic bounds, and second, to reveal situations in which current methods are sub-optimal. An interesting question that arises in this context is the effect of the choice of measurement matrix on the information-theoretic limits of sparsity recovery. As we will see, the standard Gaussian measurement ensemble is an optimal choice in terms of minimizing the number of observations required for recovery. However, this choice produces highly dense measurement matrices, which may lead to prohibitively high computational complexity and storage requirements. Sparse measurement matrices can reduce this complexity, and also lower communication cost and latency in distributed-network and streaming applications. On the other hand, such measurement sparsity, though beneficial from a computational standpoint, may reduce statistical efficiency by requiring more observations for reliable decoding.
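To make the two ensembles concrete, the following Python sketch (variable names, the ±1 signal model, and the noise level are illustrative choices, not specifications from the paper) generates a dense Gaussian measurement matrix and a γ-sparsified counterpart in which each entry is non-zero with probability γ, then forms noisy observations y = Xβ + w of a k-sparse signal.

```python
import numpy as np

rng = np.random.default_rng(0)

n, p, k = 128, 512, 8   # observations, ambient dimension, signal sparsity
gamma = 0.1             # fraction of non-zero entries per row

# k-sparse signal: support chosen uniformly at random, non-zeros set to +/-1
beta = np.zeros(p)
support = rng.choice(p, size=k, replace=False)
beta[support] = rng.choice([-1.0, 1.0], size=k)

# Dense ensemble: i.i.d. N(0, 1) entries
X_dense = rng.standard_normal((n, p))

# gamma-sparsified ensemble: each entry is non-zero with probability gamma;
# non-zeros are scaled N(0, 1/gamma) so that entry variances match the
# dense ensemble on average (one common convention, assumed here)
mask = rng.random((n, p)) < gamma
X_sparse = mask * rng.normal(0.0, 1.0 / np.sqrt(gamma), size=(n, p))

# Noisy linear observations y = X beta + w with i.i.d. Gaussian noise
sigma = 0.5
y_dense = X_dense @ beta + sigma * rng.standard_normal(n)
y_sparse = X_sparse @ beta + sigma * rng.standard_normal(n)

print(np.count_nonzero(X_sparse) / (n * p))  # close to gamma
```

The γ-sparsified matrix stores roughly a γ fraction of the entries of its dense counterpart, which is the source of the computational and communication savings discussed above.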
Therefore, an important issue is to characterize the trade-off between measurement sparsity and statistical efficiency. With this motivation, this paper makes two contributions. First, we derive sharper necessary conditions for exact support recovery, applicable to a general class of dense measurement matrices (including non-Gaussian ensembles). In conjunction with sufficient conditions from previous work, this analysis provides a sharp characterization of necessary and sufficient conditions for various sparsity regimes. Our second contribution is to address the effect of measurement sparsity, meaning the fraction γ ∈ (0, 1] of non-zeros per row in the matrices used to collect measurements. We derive lower bounds on the number of observations required for exact sparsity recovery, as a function of the signal dimension p, the signal sparsity k, and the measurement sparsity γ. This analysis highlights a trade-off between the statistical efficiency of a measurement ensemble and the computational complexity associated with storing and manipulating it.

The remainder of the paper is organized as follows. We first define our problem formulation in Section 1.1, and then discuss our contributions and some connections to related work in Section 1.2. Section 2 provides precise statements of our main results, as well as a discussion of their consequences. Section 3 provides proofs of the necessary conditions for the various classes of measurement matrices, while proofs of the more technical lemmas are deferred to the appendices. Finally, we conclude and discuss open problems in Section 4.
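The information-theoretic limits discussed above are defined relative to an optimal decoder that is allowed unbounded computation. One standard way to realize such a decoder (a generic construction, sketched here under our own naming; the paper's formal decoder may differ in details) is an exhaustive search over all k-subsets of columns, keeping the subset with the smallest least-squares residual. Its cost grows combinatorially in p, which is precisely why the comparison with polynomial-time methods is of interest.

```python
from itertools import combinations

import numpy as np


def ml_support_decoder(y, X, k):
    """Exhaustive support decoding: for every k-subset of columns of X,
    least-squares fit y and return the subset with the smallest residual.
    Exponential in p -- a benchmark, not a practical algorithm."""
    best_subset, best_resid = None, np.inf
    for subset in combinations(range(X.shape[1]), k):
        Xs = X[:, list(subset)]
        coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        resid = np.sum((y - Xs @ coef) ** 2)
        if resid < best_resid:
            best_subset, best_resid = set(subset), resid
    return best_subset


# Tiny example with p small enough for exhaustive search
rng = np.random.default_rng(1)
n, p, k = 20, 10, 2
beta = np.zeros(p)
beta[[2, 7]] = 1.0
X = rng.standard_normal((n, p))
y = X @ beta + 0.1 * rng.standard_normal(n)
print(ml_support_decoder(y, X, k))
```

At this signal-to-noise ratio the decoder recovers the true support {2, 7}; the necessary conditions in the paper characterize when even this exhaustive procedure must fail.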