Nearly-optimal bounds for sparse recovery in generic norms, with applications to k-median sketching [article]

Arturs Backurs, Piotr Indyk, Eric Price, Ilya Razenshteyn, David P. Woodruff
2015 arXiv pre-print
We initiate the study of trade-offs between sparsity and the number of measurements in sparse recovery schemes for generic norms. Specifically, for a norm ‖·‖, sparsity parameter k, approximation factor K>0, and probability of failure P>0, we ask: what is the minimal value of m so that there is a distribution over m × n matrices A with the property that for any x, given Ax, we can recover a k-sparse approximation to x in the given norm with probability at least 1-P? We give a partial answer to this problem, by showing that for norms that admit efficient linear sketches, the optimal number of measurements m is closely related to the doubling dimension of the metric induced by the norm ‖·‖ on the set of all k-sparse vectors. By applying our result to specific norms, we cast known measurement bounds in our general framework (for the ℓ_p norms, p ∈ [1,2]) as well as provide new, measurement-efficient schemes (for the Earth-Mover Distance norm). The latter result directly implies more succinct linear sketches for the well-studied planar k-median clustering problem. Finally, our lower bound for the doubling dimension of the EMD norm enables us to address the open question of [Frahling-Sohler, STOC'05] about the space complexity of clustering problems in the dynamic streaming model.
arXiv:1504.01076v1 fatcat:jxyastg24nbi7dxmicdddwogha