Local neighbourhood extension of SMOTE for mining imbalanced data

Tomasz Maciejewski, Jerzy Stefanowski
2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM), 2011
In this paper we discuss the problems of inducing classifiers from imbalanced data and of improving recognition of the minority class using focused resampling techniques. We are particularly interested in the SMOTE over-sampling method, which generates new synthetic examples from the minority class between the closest neighbours from this class. However, SMOTE can also overgeneralize the minority class region, as it does not consider the distribution of other neighbours from the majority classes. Therefore, we introduce a new generalization of SMOTE, called LN-SMOTE, which exploits information about the local neighbourhood of the considered examples more precisely. In the experiments we compare this method with the original SMOTE and its two most closely related generalizations, Borderline and Safe-Level SMOTE. All these pre-processing methods are applied together with either decision tree or Naive Bayes classifiers. The results show that the new LN-SMOTE method improves evaluation measures for the minority class.
doi:10.1109/cidm.2011.5949434 dblp:conf/cidm/MaciejewskiS11 fatcat:ddx6ju3xrzglfd4q4nmxagueuq
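
The interpolation step that the abstract attributes to the original SMOTE can be illustrated with a minimal sketch. This is not the LN-SMOTE method from the paper, and it is not the authors' code. The function name `smote_sketch` and the parameters `n_synthetic`, `k` and `seed` are assumptions made for illustration, as is the use of Euclidean distance on purely numeric attributes.

```python
import numpy as np


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority examples by interpolating between a
    minority example and one of its k nearest minority neighbours.

    Illustrative sketch of the basic SMOTE idea only (hypothetical helper).
    """
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    if n < 2:
        raise ValueError("need at least two minority examples")
    k = min(k, n - 1)  # cannot have more neighbours than other examples
    rng = np.random.default_rng(seed)

    # Pairwise Euclidean distances computed inside the minority class only.
    diff = minority[:, None, :] - minority[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # an example is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]

    synthetic = np.empty((n_synthetic, minority.shape[1]))
    for i in range(n_synthetic):
        base = rng.integers(n)                  # random minority example
        nb = neighbours[base, rng.integers(k)]  # one of its k neighbours
        gap = rng.random()                      # position on the segment
        synthetic[i] = minority[base] + gap * (minority[nb] - minority[base])
    return synthetic


if __name__ == "__main__":
    X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    print(smote_sketch(X_min, n_synthetic=3, k=2))
```

Note that the sketch never looks at majority class examples when it picks neighbours. That is the limitation the abstract points to: a synthetic point can land in a region dominated by the majority classes, which is what the local neighbourhood analysis in LN-SMOTE is meant to address.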