The Improvement of C4.5 Algorithm Accuracy in Predicting Forest Fires Using Discretization and AdaBoost

Tomi Bagus Nugroho, Endang Sugiharti
2021 Journal of Advances in Information Systems and Technology  
Data mining is a process that helps analyze data collected under particular circumstances using a mathematical approach. The decision tree is an algorithm often used in data mining, and C4.5 is one such decision tree algorithm. In application, data mining consists of preprocessing, mining, pattern evaluation, and knowledge presentation. The forest fire data used were taken from the UCI Machine Learning Repository. Data normalization, data transformation, and discretization were used to preprocess the data in this research. To improve accuracy, the C4.5 algorithm can be combined with AdaBoost. This study aims to determine how applying discretization to the C4.5 algorithm with AdaBoost predicts forest fires, and to measure the resulting increase in accuracy. Based on the results of ten k-fold cross-validations, the highest accuracy value obtained is 98.04%. The implementation of discretization and AdaBoost increased the accuracy of forest fire predictions by 13.42%.
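
As a rough illustration of two ideas the abstract names, the sketch below discretizes a numeric attribute and then scores it with gain ratio, the split criterion C4.5 is known for. It is not the authors' implementation: the abstract does not say which discretization method was used, so equal-width binning is an assumption, and the function names (`equal_width_bins`, `entropy`, `gain_ratio`) and the toy temperature data are hypothetical. AdaBoost and cross-validation are left out to keep the example short.

```python
import math
from collections import Counter


def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index in 0..n_bins-1.

    Equal-width binning is an assumption; the abstract does not
    specify the discretization method used in the study.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # min(...) keeps the maximum value inside the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete items."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(items).values())


def gain_ratio(feature, labels):
    """Information gain of `feature` divided by its split information."""
    total = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    # Weighted entropy of the class labels after splitting on the feature.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0


# Toy, made-up data (not taken from the UCI forest fire dataset).
temps = [12.0, 15.0, 18.0, 25.0, 28.0, 31.0]
fire = ["no", "no", "no", "yes", "yes", "yes"]

bins = equal_width_bins(temps, 2)
print(bins, gain_ratio(bins, fire))
```

In this toy case the two bins separate the classes perfectly, so the discretized attribute gets the maximum gain ratio. On real data the number of bins changes both the information gain and the split information, which is one reason discretization can affect the accuracy of a tree.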
doi:10.15294/jaist.v3i1.49094 fatcat:vzadyo6f2ndubjtvjgncp7j4o4