Nearly optimal sparse fourier transform

Haitham Hassanieh, Piotr Indyk, Dina Katabi, Eric Price
2012 Proceedings of the 44th symposium on Theory of Computing - STOC '12
We consider the problem of computing the k-sparse approximation to the discrete Fourier transform of an ndimensional signal. We show: • An O(k log n)-time randomized algorithm for the case where the input signal has at most k non-zero Fourier coefficients, and • An O(k log n log(n/k))-time randomized algorithm for general input signals. Both algorithms achieve o(n log n) time, and thus improve over the Fast Fourier Transform, for any k = o(n). They are the first known algorithms that satisfy
more » ... s property. Also, if one assumes that the Fast Fourier Transform is optimal, the algorithm for the exactly k-sparse case is optimal for any k = n Ω(1) . We complement our algorithmic results by showing that any algorithm for computing the sparse Fourier transform of a general signal must use at least Ω(k log(n/k)/ log log n) signal samples, even if it is allowed to perform adaptive sampling. √ nk log 3/2 n) runtime. This algorithm outperforms FFT for any k smaller than Θ(n/ log n). Despite impressive progress on sparse DFT, the state of the art suffers from two main limitations: 1. None of the existing algorithms improves over FFT's runtime for the whole range of sparse signals, i.e., k = o(n). 2. Most of the aforementioned algorithms are quite complex, and suffer from large "big-Oh" constants (the algorithm of [HIKP12b] is an exception, but has a running time that is polynomial in n). Results. In this paper, we address these limitations by presenting two new algorithms for the sparse Fourier transform. We require that the length n of the input signal is a power of 2. We show: • An O(k log n)-time algorithm for the exactly k-sparse case, and • An O(k log n log(n/k))-time algorithm for the general case. The key property of both algorithms is their ability to achieve o(n log n) time, and thus improve over the FFT, for any k = o(n). These algorithms are the first known algorithms that satisfy this property. Moreover, if one assume that FFT is optimal and hence the DFT cannot be computed in less than O(n log n) time, the algorithm for the exactly k-sparse case is optimal 4 as long as k = n Ω(1) . Under the same assumption, the result for the general case is at most one log log n factor away from the optimal runtime for the case of "large" sparsity k = n/ log O(1) n. Furthermore, our algorithm for the exactly sparse case (depicted as Algorithm 3.1 on page 5) is quite simple and has low big-Oh constants. In particular, our preliminary implementation of a variant of this algorithm is faster than FFTW, a highly efficient implementation of the FFT, for n = 2 22 and k ≤ 2 17 [HIKP12a]. In contrast, for the same signal size, prior algorithms were faster than FFTW only for k ≤ 2000 [HIKP12b]. 5 We complement our algorithmic results by showing that any algorithm that works for the general case must use at least Ω(k log(n/k)/ log log n) samples from x. The lower bound uses techniques from [PW11], which shows a lower bound of Ω(k log(n/k)) for the number of arbitrary linear measurements needed to compute the k-sparse approximation of an n-dimensional vector x. In comparison to [PW11], our bound is slightly worse but it holds even for adaptive sampling, where the algorithm selects the samples based on the values of the previously sampled coordinates. 6 Note that our algorithms are non-adaptive, and thus limited by the more stringent lower bound of [PW11]. 1 The algorithm of [Man92], as stated in the paper, addresses only the exactly k-sparse case. However, it can be extended to the general case using relatively standard techniques. 2 All of the above algorithms, as well as the algorithms in this paper, need to make some assumption about the precision of the input; otherwise, the right-hand-side of the expression in Equation (1) contains an additional additive term. See Preliminaries for more details. 3 The paper does not estimate the exact value of c. We estimate that c ≈ 3. 4 One also needs to assume that k divides n. See Section 5 for more details. 5 Note that both numbers (k ≤ 2 17 and k ≤ 2000) are for the exactly k-sparse case. The algorithm in [HIKP12b] can deal with the general case, but the empirical runtimes are higher. 6 Note that if we allow arbitrary adaptive linear measurements of a vector x, then its k-sparse approximation can be computed using only O(k log log(n/k)) samples [IPW11] . Therefore, our lower bound holds only where the measurements, although adaptive, are limited to those induced by the Fourier matrix. This is the case when we want to compute a sparse approximation to x from samples of x. 7 One can randomize the positions of the frequencies by sampling the signal in time domain appropriately [GGI + 02, GMS05]. See Preliminaries for the description. 8 We note that although the two-sample approach employed in our algorithm works in theory only for the exactly k-sparse case, our preliminary experiments show that using a few more samples to estimate the phase works surprisingly well even for general signals. 9 In particular, the method of [GLPS10] uses measurements corresponding to a random error correcting code.