Bag-of-Phrases (BoPh) and sentiment analysis of Arabic text in Twitter

Hamoud H Alshammari
2020 Indian Journal of Science and Technology  
Background/Objectives: Sentiment analysis plays a main role in various text mining problems. Although Arabic text mining is important, especially in the field of sentiment analysis, there is a paucity of research on it, even though it plays an important role in different issues in Arabic countries. The Arabic language has many dialects that people use to express their feelings on social media. The objective of this study is to perform an experiment that follows the subjective opinion from the text. Subjective analysis is one way to improve the accuracy of sentiment results for texts in dialects, such as the Saudi dialect, that hide various meanings behind the words. Methods/Statistical analysis: In this study, we manually annotated more than 8,000 tweets to build training and testing data sets with positive or negative words and phrases. We then proposed a "Bag of Phrases" (BoPh) methodology to analyze the sentiments in the texts, which helped to improve the performance of sentiment analysis. Because the bag-of-words method alone is not enough in many cases, we applied a Naive Bayes algorithm to test our method. Findings: The results show that the accuracy of obtaining a true positive or true negative is about 84% compared with the manual annotation process. The accuracy is calculated after taking into consideration the margin of error due to the manual annotation step and the subjective interpretation of the texts by the annotators. Novelty/Applications: The novelty of the study lies in a more accurate training data set compared with other works on the Saudi dialect of Arabic text, and in proposing the BoPh concept.
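The idea of classifying phrase features with Naive Bayes can be sketched roughly as follows. This is a minimal illustration only, not the author's implementation. It assumes that a "phrase" is approximated by a word n-gram, that tokens are split on whitespace, and that add-one smoothing is used. The names `phrases` and `PhraseNaiveBayes` and the English toy sentences are hypothetical stand-ins for the annotated tweets.

```python
import math
from collections import Counter


def phrases(text, max_n=2):
    """Return all word n-grams of length 1..max_n (single words plus phrases).

    Assumption: a phrase is approximated by a contiguous word n-gram.
    """
    tokens = text.split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]


class PhraseNaiveBayes:
    """Multinomial Naive Bayes over phrase counts with add-one smoothing."""

    def fit(self, texts, labels, max_n=2):
        self.max_n = max_n
        self.class_counts = Counter(labels)
        self.feature_counts = {label: Counter() for label in self.class_counts}
        for text, label in zip(texts, labels):
            self.feature_counts[label].update(phrases(text, max_n))
        # Vocabulary of every phrase seen in any class during training.
        self.vocab = set()
        for counts in self.feature_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(doc_count / total_docs)  # log prior
            for feat in phrases(text, self.max_n):
                if feat in self.vocab:  # phrases never seen in training are skipped
                    score += math.log((counts[feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data standing in for manually annotated tweets.
train_texts = [
    "the service is good",
    "really good food",
    "not good at all",
    "the food is not good",
]
train_labels = ["pos", "pos", "neg", "neg"]

model = PhraseNaiveBayes().fit(train_texts, train_labels)
print(model.predict("not good"))
print(model.predict("really good"))
```

Including the two-word feature "not good" alongside the single words lets the classifier attach a polarity to the phrase as a whole, which is the intuition the abstract gives for going beyond a bag of words.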
doi:10.17485/ijst/v13i40.1202 fatcat:2xeedznfabhxlafriepv7hlqie