Representation of Word Sentiment, Idioms and Senses

Giuseppe Attardi
2015 Italian Information Retrieval Workshop  
Distributional Semantic Models (DSM), which represent words as vectors of weights over a high-dimensional feature space, have proved very effective at capturing semantic and syntactic word similarity. For certain tasks, however, it is important to represent contrasting aspects such as polarity, different senses, or idiomatic use of words. We present two methods for creating embeddings that take such characteristics into account: a feed-forward neural network for learning sentiment-specific and a ... p-gram model for learning sense-specific embeddings. Sense-specific embeddings can be used for query disambiguation and other classification tasks. We also present an approach for recognizing idiomatic expressions by means of the embeddings, which can be used to segment queries into meaningful chunks. The implementation is available as a Python library with core numerical processing written in C++, using a parallel linear algebra library for efficiency and scalability.
dblp:conf/iir/Attardi15 fatcat:cjlnaqbigzhxje63sbizlqxyba