Comparison of Short-Text Sentiment Analysis Methods for Croatian

Leon Rotim, Jan Šnajder
2017 Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing  
We focus on the task of supervised sentiment classification of short and informal texts in Croatian, using two simple yet effective methods: word embeddings and string kernels. We investigate whether word embeddings offer any advantage over corpus- and preprocessing-free string kernels, and how these compare to bag-of-words baselines. We conduct a comparison on three different datasets, using different preprocessing methods and kernel functions. Results show that, on two out of three datasets,
word embeddings outperform string kernels, which in turn outperform word and n-gram bag-of-words baselines.
doi:10.18653/v1/w17-1411 dblp:conf/acl-bsnlp/RotimS17 fatcat:rurxoto7qfhztpbarchzjlrpny