A Pareto-based Ensemble with Feature and Instance Selection for Learning from Multi-Class Imbalanced Datasets

Alberto Fernández, Cristobal José Carmona, María José del Jesus, Francisco Herrera
2017 International Journal of Neural Systems  
Imbalanced classification concerns problems in which instances are unevenly distributed among the classes. When instances also lie in overlapping regions between classes, modeling the problem correctly becomes harder. Current solutions to both issues often focus on the binary case, since multi-class datasets require additional effort to address. In this research, we address these problems by combining feature and instance selection.
Feature selection simplifies the overlapping areas, easing the generation of rules that distinguish among the classes. Instance selection from all classes addresses the imbalance itself by finding the most appropriate class distribution for the learning task, and may also remove noise and difficult borderline examples. To obtain an optimal joint set of features and instances, we embed the search for both in a Multi-Objective Evolutionary Algorithm, using the C4.5 decision tree as the baseline classifier in this wrapper approach. The multi-objective scheme offers a double advantage: the search space becomes broader, and the set of different solutions it yields can be used to build an ensemble of classifiers. The proposal has been compared against several state-of-the-art solutions for imbalanced classification, showing excellent results on both binary and multi-class problems.
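The joint selection and the Pareto-based ensemble described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the objectives, the encoding, or the evolutionary operators, so the binary masks, the two-objective score tuples, and the function names `apply_masks`, `dominates`, and `pareto_front` are all assumptions made for illustration. The evolutionary search and the C4.5 training step are omitted.

```python
from typing import List, Sequence, Tuple

Row = Tuple[Sequence[float], str]  # (feature vector, class label)


def apply_masks(data: Sequence[Row],
                feature_mask: Sequence[int],
                instance_mask: Sequence[int]) -> List[Tuple[List[float], str]]:
    """Keep only the selected instances, projected onto the selected features.

    Assumption: a candidate solution is a pair of 0/1 masks, one entry per
    feature and one entry per training instance.
    """
    kept = []
    for (features, label), keep_row in zip(data, instance_mask):
        if keep_row:
            projected = [v for v, keep_col in zip(features, feature_mask) if keep_col]
            kept.append((projected, label))
    return kept


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is no worse than `b` on every objective and strictly
    better on at least one (all objectives are maximized here)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Sequence[Sequence[float]]) -> List[int]:
    """Return the indices of the non-dominated score vectors."""
    return [
        i for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical toy data: three instances, three features.
data = [([1.0, 2.0, 3.0], "a"), ([4.0, 5.0, 6.0], "b"), ([7.0, 8.0, 9.0], "a")]
reduced = apply_masks(data, feature_mask=[1, 0, 1], instance_mask=[1, 0, 1])

# Hypothetical objective values for three candidate solutions.
front = pareto_front([(0.9, 0.2), (0.8, 0.5), (0.7, 0.4)])
```

In a wrapper setting, each reduced dataset would train one classifier, and the classifiers belonging to the non-dominated candidates would form the ensemble; how their votes are combined is not specified in the abstract.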
doi:10.1142/s0129065717500289 pmid:28633551 fatcat:3avzppodxrgptmymycxcwlzkqa