An Effective Approach for Imbalanced Classification: Unevenly Balanced Bagging

Guohua Liang, Anthony Cohn
2013 PROCEEDINGS OF THE THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE TWENTY-EIGHTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE  
Learning from imbalanced data is an important problem in data mining research. Much research has addressed the problem of imbalanced data by using sampling methods to generate an equally balanced training set to improve the performance of the prediction models, but it is unclear what ratio of class distribution is best for training a prediction model. Bagging is one of the most popular and effective ensemble learning methods for improving the performance of prediction models; however, there is
more » ... major drawback on extremely imbalanced data-sets. It is unclear under which conditions bagging is outperformed by other sampling schemes in terms of imbalanced classification. These issues motivate us to propose a novel approach, unevenly balanced bagging (UBagging) to boost the performance of the prediction model for imbalanced binary classification. Our experimental results demonstrate that UBagging is effective and statistically significantly superior to single learner decision trees J48 (SingleJ48), bagging, and equally balanced bagging (BBagging) on 32 imbalanced data-sets.
doi:10.1609/aaai.v27i1.8536 fatcat:m6nk26ld25fs5dzll5ysirsx74