The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis

Kashyap Popat, Balamurali A. R., Pushpak Bhattacharyya, Gholamreza Haffari
2013 Annual Meeting of the Association for Computational Linguistics  
Expensive feature engineering based on WordNet senses has been shown to be useful for document-level sentiment classification. A plausible reason for this performance improvement is the reduction in data sparsity. However, such a reduction could be achieved with less effort through syntagma-based word clustering. In this paper, the problem of data sparsity in sentiment analysis, both monolingual and cross-lingual, is addressed through clustering. Experiments show that cluster-based data sparsity reduction leads to better performance than sense-based classification for sentiment analysis at the document level. A similar idea is applied to Cross-Lingual Sentiment Analysis (CLSA), and it is shown that reducing data sparsity (after translation or bilingual mapping) produces higher accuracy than Machine Translation based CLSA and sense-based CLSA.
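As a rough illustration of why cluster-based features can reduce data sparsity, the sketch below replaces each word with a cluster identifier before building a bag-of-features representation. It is not the authors' implementation: the mapping `WORD_TO_CLUSTER`, the cluster labels, and the helper `cluster_features` are hypothetical and hand-written here, whereas in practice the clusters would be induced from an unlabelled corpus.

```python
from collections import Counter

# Hypothetical word-to-cluster mapping, hand-written for illustration only.
# A real mapping would be induced from an unlabelled corpus.
WORD_TO_CLUSTER = {
    "good": "C1", "great": "C1", "excellent": "C1",
    "bad": "C2", "awful": "C2", "terrible": "C2",
    "movie": "C3", "film": "C3",
}


def cluster_features(tokens, word_to_cluster):
    """Return a bag of cluster IDs; words without a cluster are kept as-is."""
    return Counter(word_to_cluster.get(token, token) for token in tokens)


train_doc = "great movie".split()
test_doc = "excellent film".split()

# With plain word features the two documents share no features at all.
word_overlap = set(train_doc) & set(test_doc)

# With cluster features the same two documents share every feature.
cluster_overlap = (
    set(cluster_features(train_doc, WORD_TO_CLUSTER))
    & set(cluster_features(test_doc, WORD_TO_CLUSTER))
)

print("shared word features:", sorted(word_overlap))
print("shared cluster features:", sorted(cluster_overlap))
```

In this toy setting, a classifier trained on word features would see only unseen features in the test document, while one trained on cluster features would see features it already has weights for.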
dblp:conf/acl/PopatRBH13 fatcat:5dedjey5evhb3fem2lr6jjqkke