Education Data Mining Application for Predicting Students' Achievements of Portuguese Using Ensemble Model

Shuai Zhang, Jie Chen, Wenyu Zhang, Qiwei Xu, Jiaxuan Shi
2021 Science Journal of Education  
With the emergence of massive educational data, education data mining techniques have drawn considerable interest from scholars seeking to explore the relationship between students' achievements and other factors. In this study, a data set on students' achievements in Portuguese at two secondary schools in Portugal is selected for education data mining; it covers personal information as well as social and school-related factors. To analyze the relationship between the students' achievements and other factors, this study proposes an ensemble model based on weighted voting for predicting the students' achievements in Portuguese in the final period. First, the raw data are preprocessed using basic methods, including dummy coding, correlation analysis, standardization, and normalization. Second, outlier adaption based on the isolation forest algorithm is applied to the data set to enhance the robustness of the ensemble model. Finally, two base classifiers, i.e., gradient boosting decision tree and extreme gradient boosting, are integrated to form the ensemble model. Experiments verify the superiority of the proposed model by comparing it with five base classifiers: gradient boosting decision tree, adaptive boosting, extreme gradient boosting, random forest, and decision tree. The experimental results show that the ensemble model outperforms the base classifiers in classification and confirm the validity of the outlier adaption based on the isolation forest algorithm.
doi:10.11648/j.sjedu.20210902.16 fatcat:ggxhkbafgjayvpp2f3ui22r2fa
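
The weighted-voting step named in the abstract can be illustrated with a minimal sketch. The abstract does not say whether the voting is hard or soft, nor how the weights are chosen, so the code below assumes soft voting over class probabilities. The function name `weighted_soft_vote`, the example probabilities, and the weights 0.4 and 0.6 are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def weighted_soft_vote(
    prob_sets: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> Tuple[int, List[float]]:
    """Combine per-classifier class probabilities with a weighted average.

    Returns the index of the winning class and the combined probabilities.
    Assumption: soft voting; the paper's exact scheme is not given in the abstract.
    """
    if not prob_sets or len(prob_sets) != len(weights):
        raise ValueError("need one weight per classifier")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_classes = len(prob_sets[0])
    combined = [0.0] * n_classes
    for probs, w in zip(prob_sets, weights):
        if len(probs) != n_classes:
            raise ValueError("all classifiers must report the same classes")
        for k, p in enumerate(probs):
            # Normalize each weight so the combined vector still sums to 1.
            combined[k] += (w / total) * p

    best = max(range(n_classes), key=combined.__getitem__)
    return best, combined


# Hypothetical outputs of the two base classifiers for one student
# (class 0 = fail, class 1 = pass); the values are illustrative only.
gbdt_probs = [0.6, 0.4]
xgb_probs = [0.3, 0.7]
label, combined = weighted_soft_vote([gbdt_probs, xgb_probs], [0.4, 0.6])
```

With these illustrative inputs the two classifiers disagree, and the larger weight on the second one decides the final label. In practice the weights would be set from validation performance or a similar criterion.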