A Hybrid Framework for Sentiment Analysis using Genetic Algorithm based Feature Reduction

Farkhund Iqbal, Jahanzeb Maqbool, Benjamin C. M. Fung, Rabia Batool, Asad Masood Khattak, Saiqa Aleem, Patrick C. K. Hung
2019 IEEE Access  
Due to the rapid development of Internet technologies and social media, sentiment analysis has become an important opinion mining technique. Recent research has described the effectiveness of different sentiment classification techniques, ranging from simple rule-based and lexicon-based approaches to more complex machine learning algorithms. While lexicon-based approaches have suffered from the lack of dictionaries and labeled data, machine learning approaches have fallen short in terms of accuracy. This paper proposes an integrated framework that bridges the gap between lexicon-based and machine learning approaches to achieve better accuracy and scalability. To solve the scalability issue that arises as the feature-set grows, a novel genetic algorithm (GA)-based feature reduction technique is proposed. By using this hybrid approach, we are able to reduce the feature-set size by up to 42% without compromising accuracy. A comparison of our feature reduction technique with the more widely used principal component analysis (PCA) and latent semantic analysis (LSA) based feature reduction techniques shows up to 15.4% higher accuracy than PCA and up to 40.2% higher accuracy than LSA. Furthermore, we also evaluate our sentiment analysis framework on other metrics, including precision, recall, F-measure, and feature size. To demonstrate the efficacy of GA-based designs, we also propose the novel cross-disciplinary area of geopolitics as a case study application for our sentiment analysis framework. The experimental results show that the framework accurately measures public sentiments and views regarding various topics such as terrorism, global conflicts, and social issues. We envisage the applicability of our proposed work in various areas including security and surveillance, law and order, and public administration.
INDEX TERMS Classifier, feature optimization, genetic algorithm, machine learning, sentiment analysis.
doi:10.1109/access.2019.2892852 fatcat:5ixcf35l2zdk5m32gdfgtyxvvu
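
The abstract names GA-based feature reduction but does not give the encoding, the genetic operators, or the fitness function. The following is only a minimal sketch of the general idea, not the authors' method. It assumes a binary mask over features as the chromosome, truncation selection, single-point crossover, bit-flip mutation, and a fitness equal to the training accuracy of a toy nearest-centroid classifier minus a small per-feature penalty. The function names, the synthetic data, and all parameter values are hypothetical.

```python
import random

def make_toy_data(n_samples=60, n_noise=6, seed=1):
    """Synthetic data (assumption): 2 informative features followed by noise features."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        informative = [label + rng.gauss(0, 0.2), (1 - label) + rng.gauss(0, 0.2)]
        noise = [rng.random() for _ in range(n_noise)]
        X.append(informative + noise)
        y.append(label)
    return X, y

def centroid_accuracy(X, y, mask):
    """Training-set accuracy of a nearest-centroid classifier on the kept features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature set cannot classify anything
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X, y):
        pred = min(
            centroids,
            key=lambda c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c])),
        )
        correct += pred == t
    return correct / len(y)

def fitness(X, y, mask, penalty=0.01):
    """Reward accuracy, penalize feature count (the penalty weight is an assumption)."""
    return centroid_accuracy(X, y, mask) - penalty * sum(mask)

def ga_select(X, y, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Evolve a binary feature mask; returns the fittest mask in the final population."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(X, y, m), reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection; parents survive unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # single-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < p_mut) for bit in child]  # bit-flip mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(X, y, m))

if __name__ == "__main__":
    X, y = make_toy_data()
    best = ga_select(X, y)
    print("mask:", best, "kept:", sum(best), "of", len(best))
    print("accuracy on kept features:", centroid_accuracy(X, y, best))
```

In a sentiment analysis setting the columns would be text features (for example term weights) and the classifier would be whichever model the framework uses; the per-feature penalty is what pushes the search toward a smaller feature-set.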