Arabic Sentiment Analysis with Optimal Combination of Features Selection and Machine Learning Approaches

Bilal Sabri, Saidah Saad
2016 Research Journal of Applied Sciences, Engineering and Technology
The main objective of this study is to design a model that applies a novel technique to sentiment analysis in the Arabic language. Sentiment analysis is a task that draws on web mining, Natural Language Processing (NLP) and Machine Learning (ML). Most research on sentiment analysis has focused on English-language texts. Research on sentiment analysis in the Arabic language and other ... is therefore still in its infancy. This study empirically evaluates three Feature Selection (FS) methods, namely Information Gain (IG), Chi-square (CHI) and Gini Index (GI), and three classification approaches, namely Association Rule (AR) mining, the N-gram model and the meta-classifier approach, for sentiment classification in Arabic. The experiments were carried out on the Opinion Corpus of Arabic (OCA). The results were favorable and varied with the algorithm used and the number of selected features; they show that feature selection can improve the performance of sentiment classification in Arabic. Among the FS methods, CHI produced the best performance, and the meta-classifier, a combination approach, outperformed the other classification approaches. In conclusion, the combination approach (meta-classifier) with chi-square feature selection produced the most accurate classifier, with accuracy as high as 90.80%.
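As an illustration of the chi-square feature selection named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes binary term presence per document and a two-class (positive/negative) label, and uses the standard 2x2 contingency form of the statistic. The function names `chi_square_scores` and `select_top_k` and the toy English tokens are hypothetical placeholders; the study itself worked on Arabic reviews from OCA.

```python
from collections import Counter


def chi_square_scores(docs, labels, positive="pos"):
    """Chi-square score of each term against a binary class label.

    docs   : list of token lists (one list per document)
    labels : list of class labels, same length as docs
    """
    n = len(docs)
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = n - n_pos

    # Document frequency of each term within each class.
    df_pos, df_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        target = df_pos if y == positive else df_neg
        target.update(set(tokens))  # count a term once per document

    scores = {}
    for term in set(df_pos) | set(df_neg):
        a = df_pos[term]   # positive documents containing the term
        b = df_neg[term]   # negative documents containing the term
        c = n_pos - a      # positive documents without the term
        d = n_neg - b      # negative documents without the term
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        scores[term] = n * (a * d - c * b) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms (ties broken alphabetically)."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Toy data (hypothetical, for illustration only).
docs = [
    ["great", "film"],
    ["great", "story"],
    ["boring", "film"],
    ["boring", "story"],
]
labels = ["pos", "pos", "neg", "neg"]

scores = chi_square_scores(docs, labels)
selected = select_top_k(scores, 2)
print(scores["great"], scores["film"])  # 4.0 0.0
print(selected)                         # ['boring', 'great']
```

In this toy corpus the terms that occur in only one class receive the highest score, while terms spread evenly across both classes score zero and would be dropped before training a classifier.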
doi:10.19026/rjaset.13.2956 fatcat:shy532foynf3bndoegm5qf5wxm