An Optimal Algorithm for Large Frequency Moments Using $O(n^{1-2/k})$ Bits

Vladimir Braverman, Jonathan Katzman, Charles Seidell, Gregory Vorsanger
unpublished
In this paper, we provide the first optimal algorithm for the remaining open question from the seminal paper of Alon, Matias, and Szegedy: approximating large frequency moments. Given a stream $D = \{p_1, p_2, \ldots, p_m\}$ of numbers from $\{1, \ldots, n\}$, the frequency of $i$ is defined as $f_i = |\{j : p_j = i\}|$. The $k$-th frequency moment of $D$ is defined as $F_k = \sum_{i=1}^{n} f_i^k$. We give an upper bound of $O(n^{1-2/k})$ bits on the space required to find the $k$-th frequency moment, which matches, up to a constant factor, the lower bound of [48] for constant $\epsilon$ and constant $k$. Our algorithm makes a single pass over the stream and works for any constant $k > 3$. It is based upon two major technical accomplishments: first, we provide an optimal algorithm for finding the heavy elements in a stream; and second, we provide a technique using Martingale Sketches which gives an optimal reduction of the large frequency moment problem to the all heavy elements problem. Additionally, this reduction works for any function $g$ of the form $\sum_{i=1}^{n} g(f_i)$ that requires sub-linear polynomial space, and it works in the more general turnstile model. As a result, we also provide a polylogarithmic improvement for frequency moments, frequency-based functions, spatial data streams, and measuring independence of data sets.
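To make the definition of the frequency moment concrete, the following is a minimal sketch that computes $F_k$ exactly by storing every frequency. It is not the algorithm of the paper: it is a naive baseline whose memory grows with the number of distinct elements, which is the cost a sublinear-space sketch is meant to avoid. The function name `frequency_moment` and the sample stream are hypothetical, chosen only for illustration.

```python
from collections import Counter


def frequency_moment(stream, k):
    """Return the exact k-th frequency moment F_k = sum_i f_i^k.

    Naive baseline (an illustration, not the paper's method):
    it keeps one counter per distinct element of the stream.
    """
    freqs = Counter(stream)  # f_i for every i that appears in the stream
    return sum(f ** k for f in freqs.values())


# Hypothetical stream over {1, ..., n} with n = 3 and m = 6.
# Frequencies: f_1 = 2, f_2 = 1, f_3 = 3.
stream = [1, 3, 3, 2, 3, 1]
print(frequency_moment(stream, 4))  # 2^4 + 1^4 + 3^4 = 98
```

For $k = 1$ the same function returns the stream length $m$, which is a quick sanity check on the definition.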
fatcat:x4jwvsbyz5ddxdlgwhn3yu6xge