Overfit prevention in adaptive weighted distance nearest neighbor

Elham Parvinnia, Mohammad R. Moosavi, Mansoor Z. Jahromi, Koorush Ziarati
2011 Procedia Computer Science  
Nearest-neighbor (NN) classification was developed to perform discriminant analysis when reliable parametric estimates of probability densities are unknown or difficult to determine. The major disadvantages of NN are its sensitivity to the distance function and its use of all training instances in the generalization phase, which leads to slow execution and high storage requirements on large data sets. In our previous research, an adaptive weighted distance nearest neighbor (WDNN) algorithm was proposed that tackled both of these problems. WDNN assigns a weight to each training instance; in the generalization phase, this weight is used to compute the distance (or similarity) of a query pattern to that instance. The main disadvantage of WDNN is that it overfits the training data early. The scheme proposed in this paper improves WDNN to prevent this overfitting while preserving the advantages of the original algorithm. To this end, the weight of a training instance is updated only if the instance is effective in the classification of several instances. In this way, the class borders remain simple and overfitting is prevented. Experimental results confirm that overfitting is prevented.
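The abstract does not give the exact formula, but the weighted-distance rule it describes can be sketched as follows. This is a minimal illustration under the assumption that each training instance's weight rescales its Euclidean distance to the query (a common form, `d_w(q, x_i) = ||q - x_i|| / w_i`), so larger weights enlarge an instance's region of influence and zero-weight instances are effectively pruned; the function name and this exact scaling are hypothetical, not taken from the paper.

```python
import math

def weighted_nn_classify(query, instances, labels, weights):
    """Classify `query` by the training instance that minimizes a
    weight-scaled distance (assumed form: Euclidean distance / weight).
    Instances with weight 0 are skipped, i.e. pruned from the model."""
    best_label, best_dist = None, math.inf
    for x, y, w in zip(instances, labels, weights):
        if w <= 0:  # zero-weight instance: effectively removed
            continue
        d = math.dist(query, x) / w
        if d < best_dist:
            best_dist, best_label = d, y
    return best_label
```

With uniform weights this reduces to plain 1-NN; raising one instance's weight can pull distant queries into its class, which illustrates why uncontrolled weight updates can carve overly complex class borders and overfit.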
doi:10.1016/j.procs.2011.01.001