A Semantic Conceptualization using Tagged Bag-of-Concepts for Sentiment Analysis

Yassin S. Mehanna, M. Mahmuddin
2021 IEEE Access  
Sentiment can be expressed implicitly or explicitly in text, so identifying hidden sentiment is the main challenge for current sentiment analysis (SA) approaches. Other common challenges include false classification of opinion words, ignoring context information, and poor handling of short text, which arise from misinterpretation of the text and a lack of sufficient data for analysis tasks. In this study, a semantic conceptualization method using tagged bag-of-concepts for SA is proposed to detect the correct sentiment towards the actual target entity, considering all affective and conceptual information conveyed in the text, with a special focus on short text. Tagged bag-of-concepts (TBoC) is a novel approach that analyzes and decomposes text to uncover latent sentiments while preserving all relations and vital information, in order to boost the accuracy of SA. This study answers three questions: Does the information provided via TBoC enhance sentiment classification results on different analysis levels? Does building a structure of concepts increase the accuracy of the overall sentiment towards a specific opinion target? Does the TBoC approach enhance SA results for short text messages? The proposed solution was applied to two datasets from the restaurant domain, and sentiment analysis was performed using the TBoC structure on multiple levels, including the document, aspect, aspect-category, and topic levels. The TBoC method with a domain-specific sentiment lexicon showed exceptional performance and outperformed other state-of-the-art NB, SVM, and NN methods, especially for aspect-level SA. The use of TBoC within a semantic conceptualization model that leverages NLP tasks, ontology, and semantic methods proved highly capable of concept extraction while preserving information about context, interrelations, and latent feelings.
doi:10.1109/access.2021.3107237 fatcat:2kjmloselvgn5mmrfgt4xifjf4