A Review on Classification of Data Imbalance using BigData

Ramasubramanian, Hariharan Shanmugasundaram
2021 International Journal of Managing Information Technology  
Classification is a data mining function that assigns the items in a collection to target categories in order to provide more accurate predictions and analysis. Classification using supervised learning aims to identify the class to which a new data item belongs. With the advancement of technology and the growth of real-time data from sources such as the Internet, IoT, and social media, processing this data has become more demanding and more challenging. One such challenge in processing is data imbalance. In an imbalanced dataset, the majority classes dominate the minority classes, which biases machine learning classifiers towards the majority classes; as a result, most classification algorithms assign all of the test data to the majority classes. This paper analyzes data imbalance models using big data and classification algorithms.
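The bias described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the 95/5 label split and the helper names `majority_baseline`, `accuracy`, and `recall` are hypothetical and chosen only for illustration. A predictor that always returns the majority class scores high accuracy on imbalanced data while never detecting the minority class.

```python
from collections import Counter


def majority_baseline(train_labels):
    """Return a predictor that always outputs the most frequent training label."""
    majority_label = Counter(train_labels).most_common(1)[0][0]
    return lambda _features: majority_label


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of true `positive` items that were predicted as `positive`."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Hypothetical toy labels (assumption): 95 majority items (0), 5 minority items (1).
labels = [0] * 95 + [1] * 5

predict = majority_baseline(labels)
predictions = [predict(None) for _ in labels]

baseline_accuracy = accuracy(labels, predictions)
minority_recall = recall(labels, predictions, positive=1)

print(f"accuracy: {baseline_accuracy:.2f}, minority recall: {minority_recall:.2f}")
```

Accuracy alone hides the failure on the minority class, which is why per-class measures such as recall are commonly reported for imbalanced problems.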
doi:10.5121/ijmit.2021.13302 fatcat:7v52ofngqvgyjarvlsrqqzoimm