Approximate resilience, monotonicity, and the complexity of agnostic learning [chapter]

Dana Dachman-Soled, Vitaly Feldman, Li-Yang Tan, Andrew Wan, Karl Wimmer
2014 Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms  
A function f is d-resilient if all its Fourier coefficients of degree at most d are zero, i.e. f is uncorrelated with all low-degree parities. We study the notion of approximate resilience of Boolean functions, where we say that f is α-approximately d-resilient if f is α-close to a [−1, 1]-valued d-resilient function in ℓ_1 distance. We show that approximate resilience essentially characterizes the complexity of agnostic learning of a concept class C over the uniform distribution. Roughly speaking, if all functions in a class C are far from being d-resilient then C can be learned agnostically in time n^{O(d)}, and conversely, if C contains a function close to being d-resilient then agnostic learning of C in the statistical query (SQ) framework of Kearns has complexity at least n^{Ω(d)}. This characterization is based on the duality between ℓ_1 approximation by degree-d polynomials and approximate d-resilience that we establish. In particular, it implies that ℓ_1 approximation by low-degree polynomials, known to be sufficient for agnostic learning over product distributions, is in fact necessary. Focusing on monotone Boolean functions, we exhibit the existence of near-optimal α-approximately Ω(α√n)-resilient monotone functions for all α > 0. Prior to our work, it was even conceivable that every monotone function is Ω(1)-far from any 1-resilient function. Furthermore, we construct simple, explicit monotone functions based on Tribes and CycleRun that are close to highly resilient functions. Our constructions are based on a fairly general resilience analysis and amplification. These structural results, together with the characterization, imply nearly optimal lower bounds for agnostic learning of monotone juntas, a natural variant of the well-studied junta learning problem. In particular, we show that no SQ algorithm can efficiently agnostically learn monotone k-juntas for any k = ω(1) and any constant error less than 1/2.
doi:10.1137/1.9781611973730.34 dblp:conf/soda/Dachman-SoledFTWW15 fatcat:hsqwgnhnwzaexnqoiihqsxh22m
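
The definition of d-resilience in the abstract can be checked by brute force on small examples. The sketch below is illustrative only and is not taken from the paper: the helper names `fourier_coefficient`, `is_d_resilient`, `parity`, and `majority` are assumptions introduced here, and the enumeration over all 2^n inputs is only feasible for small n. It assumes the standard convention that f maps {−1, 1}^n to real values and that the Fourier coefficient on a set S is the average of f(x) times the product of x_i over i in S.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod


def fourier_coefficient(f, n, subset):
    """Exact Fourier coefficient of f on `subset`: E_x[f(x) * prod_{i in subset} x_i]."""
    total = sum(
        f(x) * prod(x[i] for i in subset)  # prod over an empty subset is 1
        for x in product((-1, 1), repeat=n)
    )
    return Fraction(total, 2 ** n)


def is_d_resilient(f, n, d):
    """True if every Fourier coefficient of degree at most d (including degree 0) is zero."""
    return all(
        fourier_coefficient(f, n, s) == 0
        for k in range(d + 1)
        for s in combinations(range(n), k)
    )


def parity(x):
    """Parity of all input bits (hypothetical example function)."""
    return prod(x)


def majority(x):
    """Majority vote for odd n; a monotone function (hypothetical example function)."""
    return 1 if sum(x) > 0 else -1


if __name__ == "__main__":
    print(is_d_resilient(parity, 3, 2))      # parity on 3 bits has only its top coefficient
    print(is_d_resilient(majority, 3, 1))    # majority correlates with each single bit
    print(fourier_coefficient(majority, 3, (0,)))
```

On three bits, parity is 2-resilient but not 3-resilient, while majority, which is monotone, is 0-resilient (it is balanced) but not 1-resilient because each degree-1 coefficient equals 1/2. Exact rational arithmetic via `Fraction` is used so that the zero test does not depend on floating-point tolerance.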