Learning with small disjuncts

Gary M. Weiss
1995
Systems that learn from examples often create a disjunctive concept definition. The disjuncts in the concept definition that cover only a few training examples are referred to as small disjuncts. The problem with small disjuncts is that they are more error prone than large disjuncts, yet may be necessary to achieve a high level of predictive accuracy [Holte, Acker, and Porter, 1989]. This paper extends previous work on the problem of small disjuncts by investigating the reasons why small disjuncts are more error prone than large disjuncts and by evaluating the impact small disjuncts have on inductive learning. It shows that attribute noise, missing attributes, class noise, and training-set size can each cause small disjuncts to be more error prone than large disjuncts, and it evaluates the impact these factors have on learning with small disjuncts (i.e., on the error rate). It shows, for two artificial domains, that when low levels of attribute noise are applied only to the training set (so the ability to learn the correct, noise-free concept is being evaluated), small disjuncts are primarily responsible for making learning difficult.
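The core measurement the abstract describes, comparing per-disjunct coverage against per-disjunct error rate, can be sketched in a few lines of Python. This is an illustrative toy, not code from the paper: the rules, the toy data, the noisy label, and the size threshold of 5 are all invented here to show why a disjunct covering few examples is more sensitive to a single mislabeled case.

```python
# Hypothetical sketch: measuring error rates of small vs. large disjuncts.
# A learned concept is modeled as a list of disjuncts (rules); each rule
# is a predicate over an example plus a predicted class.

def disjunct_stats(rules, examples):
    """For each rule, count the examples it covers and its misclassifications."""
    stats = []
    for predicate, predicted in rules:
        covered = [(x, y) for x, y in examples if predicate(x)]
        errors = sum(1 for _, y in covered if y != predicted)
        stats.append({"coverage": len(covered), "errors": errors})
    return stats

# Toy concept: class 1 when a > 5 (a common case -> large disjunct),
# or when a == 0 and b == 3 (a rare conjunction -> small disjunct).
rules = [
    (lambda x: x["a"] > 5, 1),
    (lambda x: x["a"] == 0 and x["b"] == 3, 1),
]

examples = (
    [({"a": 7, "b": i}, 1) for i in range(20)]        # covered by the large disjunct
    + [({"a": 0, "b": 3}, 1), ({"a": 0, "b": 3}, 0)]  # small disjunct; one noisy label
)

for s in disjunct_stats(rules, examples):
    rate = s["errors"] / s["coverage"]
    size = "small" if s["coverage"] < 5 else "large"
    print(size, s["coverage"], round(rate, 2))
```

A single noisy label inside the small disjunct's two covered examples yields a 50% error rate, while the same amount of noise would barely move the large disjunct's rate; this asymmetry is the effect the paper studies under attribute noise, class noise, missing attributes, and training-set size.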
doi:10.7282/t3-vem5-z794