Effect of word embedding vector dimensionality on sentiment analysis through short and long texts

Mohamed Chiny, Marouane Chihab, Abdelkarim Ait Lahcen, Omar Bencharef, Younes Chihab
2023 IAES International Journal of Artificial Intelligence (IJ-AI)  
Word embedding has become the most popular method for representing words in context in the natural language processing domain, especially through the word-to-vector (Word2Vec) and global vectors (GloVe) implementations. Since GloVe is a pre-trained model that provides word mapping vectors at many dimensionalities, a large number of applications rely on it, especially in the field of sentiment analysis. However, in the literature, we found that in many cases GloVe is implemented with an arbitrary dimensionality (often 300d) regardless of the length of the text to be analyzed. In this work, we conducted a study that identifies the effect of the dimensionality of word embedding mapping vectors on short and long texts in a sentiment analysis context. The results suggest that as the dimensionality of the vectors increases, the performance metrics of the model also increase for long texts. In contrast, for short texts, we observed a threshold beyond which dimensionality no longer matters.
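To make the dimensionality comparison concrete, here is a minimal sketch of the usual baseline pipeline the abstract alludes to: look up each token in a pre-trained embedding table and average the vectors into a fixed-size document representation that a sentiment classifier can consume. The lookup tables below are random placeholders standing in for real GloVe files (e.g. `glove.6B.50d.txt` vs `glove.6B.300d.txt`); the file names, vocabulary, and reviews are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

# Toy stand-ins for GloVe lookup tables at two dimensionalities.
# Real use would parse e.g. glove.6B.50d.txt / glove.6B.300d.txt;
# these random vectors only illustrate the shapes involved.
rng = np.random.default_rng(0)
vocab = ["the", "movie", "was", "great", "terrible", "plot"]
glove_50d = {w: rng.normal(size=50) for w in vocab}
glove_300d = {w: rng.normal(size=300) for w in vocab}

def doc_vector(tokens, embedding):
    """Average the vectors of in-vocabulary tokens: a common baseline
    document representation fed to a sentiment classifier."""
    dim = len(next(iter(embedding.values())))
    vecs = [embedding[t] for t in tokens if t in embedding]
    if not vecs:
        return np.zeros(dim)
    return np.mean(vecs, axis=0)

# A short and a long review map to the same fixed dimensionality,
# which is why vector size (not text length) sets the feature space.
short_review = "great movie".split()
long_review = ("the movie was great but the plot was terrible " * 10).split()

for name, emb in [("50d", glove_50d), ("300d", glove_300d)]:
    v_short = doc_vector(short_review, emb)
    v_long = doc_vector(long_review, emb)
    print(name, v_short.shape, v_long.shape)
```

The study's question then reduces to training the downstream classifier on such averaged vectors at several dimensionalities and measuring how the gain from larger vectors differs between short and long inputs.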
doi:10.11591/ijai.v12.i2.pp823-830