New Oversampling Approaches Based on Polynomial Fitting for Imbalanced Data Sets

Sami Gazzah, Najoua Essoukri Ben Amara
2008 The Eighth IAPR International Workshop on Document Analysis Systems
In classification tasks, the class-modular strategy has been widely used. It has outperformed the classical strategy for pattern classification in many applications [1]. However, in some modular architectures, such as one-against-all with support vector machine classifiers, the training data for one class risk heavily outnumbering those of the other classes. In this challenging situation, the trained classifier accurately classifies the majority class but marginalizes the minority class.
As a result, the True Negative rate (TNr) will be very high while the True Positive rate (TPr) will be low. The main goal of this work is to improve TPr without much sacrifice in TNr. In this paper, we propose oversampling the minority class using polynomial fitting functions. Four new approaches are proposed: star topology, bus topology, polynomial curve topology and mesh topology. The star and mesh topology approaches led to the best performance.
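To make the two rates in the abstract concrete, here is a minimal sketch of how TPr and TNr could be computed from predictions on an imbalanced test set. The helper name `tpr_tnr`, the label convention (1 = minority/positive, 0 = majority/negative) and the toy data are illustrative assumptions, not taken from the paper.

```python
def tpr_tnr(y_true, y_pred, positive=1):
    """Return (TPr, TNr) for binary labels; `positive` marks the minority class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Guard against an empty class to avoid division by zero.
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    tnr = tn / (tn + fp) if (tn + fp) else float("nan")
    return tpr, tnr


# Toy imbalanced case (assumed data): 5 minority and 95 majority samples.
y_true = [1] * 5 + [0] * 95
# The classifier finds only 1 of the 5 minority samples and
# mislabels 1 of the 95 majority samples.
y_pred = [1] + [0] * 4 + [0] * 94 + [1]

tpr, tnr = tpr_tnr(y_true, y_pred)
print(f"TPr = {tpr:.3f}, TNr = {tnr:.3f}")
```

In this toy case the classifier looks strong on the majority class while missing most of the minority class, which is the imbalance the oversampling approaches aim to correct.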
doi:10.1109/das.2008.74 dblp:conf/das/GazzahA08 fatcat:rkogfrydingllliujt7ki2nc5y