Complexity Theoretic Limitations on Learning Halfspaces [article]

Amit Daniely
2016 · arXiv pre-print
We study the problem of agnostically learning halfspaces, which is defined by a fixed but unknown distribution D on Q^n × {±1}. We define Err_HALF(D) as the least error of a halfspace classifier for D. A learner with access to D must return a hypothesis whose error is small compared to Err_HALF(D). Using the recently developed method of the author, Linial and Shalev-Shwartz, we prove hardness-of-learning results under a natural assumption on the complexity of refuting random K-XOR formulas. We show that no efficient learning algorithm has non-trivial worst-case performance, even under the guarantees that Err_HALF(D) < η for an arbitrarily small constant η > 0 and that D is supported in {±1}^n × {±1}. Namely, even under these favorable conditions, its error must be > 1/2 − 1/n^c for every c > 0. In particular, no efficient algorithm can achieve a constant approximation ratio. Under a stronger version of the assumption (where K can be poly-logarithmic in n), we can take η = 2^{-log^{1-ν}(n)} for arbitrarily small ν > 0. Interestingly, this is even stronger than the best known lower bounds (Arora et al. 1993, Feldman et al. 2006, Guruswami and Raghavendra 2006) for the case where the learner is restricted to return a halfspace classifier (i.e. proper learning).
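To make the central quantity concrete, the following is a minimal sketch (not from the paper) of the empirical error of a single halfspace classifier h_w(x) = sign(⟨w, x⟩) on a labeled sample; Err_HALF(D) is the infimum of this quantity over all halfspaces, taken with respect to the distribution D itself rather than a finite sample. The helper name and toy data are illustrative assumptions.

```python
import numpy as np

def halfspace_error(w, X, y):
    """Fraction of labeled points (x, y) with sign(<w, x>) != y."""
    preds = np.sign(X @ w)
    return float(np.mean(preds != y))

# Toy usage on points from {±1}^n, matching the abstract's setting.
X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]])
y = np.array([1, 1, -1, -1])   # labels determined by the first coordinate
w = np.array([1.0, 0.0])       # halfspace aligned with that coordinate
print(halfspace_error(w, X, y))  # 0.0: this halfspace classifies perfectly
```

The hardness result says that even when some halfspace makes this error arbitrarily small on D, no efficient learner can return any hypothesis (halfspace or otherwise) doing noticeably better than random guessing.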
arXiv:1505.05800v2