On the Difficulty of Designing Good Classifiers

Michelangelo Grigni, Vincent Mirelli, Christos H. Papadimitriou
2000, SIAM Journal on Computing (Print)
It is a very interesting and well-studied problem, given two point sets $W, B \subset \mathbb{R}^n$, to design a decision tree that classifies them (that is, no leaf subdivision contains points from both B and W) and is as simple as possible, either in terms of the total number of nodes, or in terms of its depth. We show that, unless ZPP=NP, the depth of a classifier cannot be approximated within a factor smaller than 6/5, and that the total number of nodes cannot be approximated within a factor smaller than n [...]. Our proof uses a simple connection between this problem and graph coloring, and uses recent results of Fürer on the inapproximability of the chromatic number. We also study the problem of designing a classifier with a single inequality that involves as few variables as possible, and point out certain aspects of the difficulty of this problem.
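To make the objective concrete, here is a minimal Python sketch of what it means for a decision tree to classify two point sets, together with the two complexity measures named in the abstract. It rests on assumptions not stated in the abstract: each internal node tests one linear inequality `coeffs · x <= threshold`, a tree is a nested tuple `(coeffs, threshold, below, above)` with `None` as a leaf, and "number of nodes" counts internal nodes only. The function names are hypothetical and do not come from the paper.

```python
def leaf_of(tree, point):
    """Return the path ('L'/'R' string) of the leaf that `point` reaches."""
    path = ""
    while tree is not None:
        coeffs, threshold, below, above = tree
        # Assumed node test: a single linear inequality on the coordinates.
        if sum(a * x for a, x in zip(coeffs, point)) <= threshold:
            path, tree = path + "L", below
        else:
            path, tree = path + "R", above
    return path


def classifies(tree, W, B):
    """True if no leaf receives points from both W and B."""
    leaves_w = {leaf_of(tree, p) for p in W}
    return leaves_w.isdisjoint(leaf_of(tree, p) for p in B)


def depth(tree):
    """Length of the longest root-to-leaf path, in tests."""
    return 0 if tree is None else 1 + max(depth(tree[2]), depth(tree[3]))


def num_nodes(tree):
    """Number of internal (test) nodes; leaves are not counted here."""
    return 0 if tree is None else 1 + num_nodes(tree[2]) + num_nodes(tree[3])


# XOR-like instance in the plane: no single axis-parallel test separates it.
W = [(0, 0), (1, 1)]
B = [(0, 1), (1, 0)]

split_x1 = ((0, 1), 0.5, None, None)        # test x1 <= 0.5
stump = ((1, 0), 0.5, None, None)           # test x0 <= 0.5 only
tree = ((1, 0), 0.5, split_x1, split_x1)    # x0 test, then x1 test on each side

print(classifies(stump, W, B), depth(stump), num_nodes(stump))
print(classifies(tree, W, B), depth(tree), num_nodes(tree))
```

Checking a given tree this way is easy; the hardness results in the abstract concern finding a tree that minimizes `depth` or `num_nodes`, even approximately.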
doi:10.1137/s009753979630814x fatcat:b6cmmmjrnnba5isggx45igdrtm