The sample complexity of agnostic learning with deterministic labels

Shai Ben-David, Ruth Urner
2014 International Symposium on Artificial Intelligence and Mathematics  
We investigate agnostic learning when there is no noise in the labeling function, that is, the labels are deterministic. We show that in this setting, in contrast to the fully agnostic learning setting (with possibly noisy labeling functions), the sample complexity of learning a binary hypothesis class is not fully determined by the VC-dimension of the class. For any d, we present classes of VC-dimension d that are learnable from O(d/ε) many samples and classes that require samples of size Ω(d/ε²). Furthermore, we show that in this setting, there exist classes where ERM algorithms are not optimal: while the class can be learned with sample complexity O(d/ε), the convergence rate of any ERM algorithm is only Ω(d/ε²). We introduce a new combinatorial parameter of a class of binary-valued functions and show that it provides a full combinatorial characterization of the sample complexity of deterministic-label agnostic learning of a class.
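The contrast stated in the abstract can be written compactly as follows (a sketch of the notation, not taken from the paper itself: d denotes the VC-dimension, ε the excess-error parameter; the confidence parameter δ and logarithmic factors are suppressed, and H₁, H₂ are hypothetical class names standing in for the constructions described above):

```latex
% Two classes of the same VC-dimension d with different
% deterministic-label agnostic sample complexities:
m_{\mathcal{H}_1}(\epsilon) \;=\; O\!\left(\frac{d}{\epsilon}\right)
\qquad \text{vs.} \qquad
m_{\mathcal{H}_2}(\epsilon) \;=\; \Omega\!\left(\frac{d}{\epsilon^{2}}\right)

% ERM suboptimality on a class learnable at the fast rate:
m_{\mathrm{ERM}}(\epsilon) \;=\; \Omega\!\left(\frac{d}{\epsilon^{2}}\right)
\quad \text{while} \quad
m_{\mathcal{H}}(\epsilon) \;=\; O\!\left(\frac{d}{\epsilon}\right)
```

In the fully agnostic (noisy-label) setting both rates would collapse to Θ(d/ε²), which is why the gap above shows that the VC-dimension alone no longer determines the rate once labels are deterministic.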
dblp:conf/isaim/Ben-DavidU14 fatcat:jgzamnuswrb2jjuf6m6z3ooskm