DECISION TREES DO NOT GENERALIZE TO NEW VARIATIONS

Yoshua Bengio, Olivier Delalleau, Clarence Simard
2010, Computational Intelligence
The family of decision tree learning algorithms is among the most widespread and studied. Motivated by the desire to develop learning algorithms that can generalize when learning highly varying functions, such as those presumably needed to achieve artificial intelligence, we study some theoretical limitations of decision trees. We demonstrate formally that they can be seriously hurt by the curse of dimensionality, in a sense that is somewhat different from other nonparametric statistical methods, and, most importantly, that they cannot generalize to variations not seen in the training set. This is because a decision tree creates a partition of the input space and needs at least one example in each of the regions associated with a leaf in order to make a sensible prediction in that region. A better understanding of the fundamental reasons for this limitation suggests that one should use forests, or even deeper architectures, instead of single trees; these provide a form of distributed representation and can generalize to variations not encountered in the training data.
doi:10.1111/j.1467-8640.2010.00366.x
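
To illustrate the abstract's central argument, the following is a minimal sketch, not taken from the paper, assuming scikit-learn and using the d-bit parity function as an example of a highly varying target; the dimension d = 10, the 50/50 split, and all other parameter values are illustrative assumptions. A single decision tree memorizes the training configurations but performs at chance level on configurations it has never seen, since each leaf region needs at least one training example to be labeled sensibly.

    # Minimal sketch (not from the paper): a decision tree fit on half of all
    # d-bit inputs of the parity function fails on the held-out configurations,
    # because each leaf region must contain a training example to get a
    # sensible label. Target function and parameters are illustrative choices.
    import itertools
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    d = 10
    X = np.array(list(itertools.product([0, 1], repeat=d)))  # all 2^d binary inputs
    y = X.sum(axis=1) % 2                                     # parity of each input

    # Hold out half of the input configurations entirely.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.5, random_state=0)

    tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

    print("train accuracy:", tree.score(X_train, y_train))  # ~1.0 (memorized)
    print("test accuracy: ", tree.score(X_test, y_test))    # ~0.5 (near chance)

Parity is used here because it is an extreme case of a highly varying function: flipping any single input bit flips the label, so the correct label in a region containing no training example cannot be inferred from neighboring regions.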