Morphological Word-Embeddings

Ryan Cotterell, Hinrich Schütze
2015 Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies  
Linguistic similarity is multi-faceted. For instance, two words may be similar with respect to semantics, syntax, or morphology, inter alia. Continuous word-embeddings have been shown to capture most of these shades of similarity to some degree. This work considers guiding word-embeddings with morphologically annotated data, a form of semi-supervised learning, encouraging the vectors to encode a word's morphology, i.e., words close in the embedded space share morphological features. We extend the log-bilinear model to this end and show that indeed our learned embeddings achieve this, using German as a case study.
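
To make the idea concrete, here is a minimal sketch of how a log-bilinear language model might be extended with a second output that predicts a morphological tag from the same context vector. It is an illustration under stated assumptions, not the authors' implementation: the diagonal position weights, the omission of bias terms, the toy vocabulary, the tag strings, and all function names are hypothetical choices made for brevity.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def predict(context_ids, word_in, pos_weights, word_out, tag_out):
    """Return (word distribution, tag distribution) for one context.

    The context vector is a position-weighted sum of the context word
    embeddings (diagonal weights: a simplifying assumption). Scores are
    linear in that vector, hence log-bilinear.
    """
    dim = len(word_in[0])
    h = [0.0] * dim
    for pos, w in enumerate(context_ids):
        for k in range(dim):
            h[k] += pos_weights[pos][k] * word_in[w][k]
    word_probs = softmax([dot(q, h) for q in word_out])
    tag_probs = softmax([dot(s, h) for s in tag_out])
    return word_probs, tag_probs

def joint_loss(word_probs, tag_probs, target_word, target_tag):
    """Negative log-likelihood of the next word plus that of its tag."""
    return -math.log(word_probs[target_word]) - math.log(tag_probs[target_tag])

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# Toy German data; the tag inventory is hypothetical.
VOCAB = ["der", "die", "Hund", "Katze"]
TAGS = ["Gender=Masc|Number=Sing", "Gender=Fem|Number=Sing"]
DIM, CONTEXT_SIZE = 8, 2

rng = random.Random(0)
word_in = rand_matrix(len(VOCAB), DIM, rng)       # context (input) embeddings
word_out = rand_matrix(len(VOCAB), DIM, rng)      # target (output) embeddings
tag_out = rand_matrix(len(TAGS), DIM, rng)        # one vector per tag
pos_weights = rand_matrix(CONTEXT_SIZE, DIM, rng) # per-position weights

context = [VOCAB.index("die"), VOCAB.index("Katze")]
word_probs, tag_probs = predict(context, word_in, pos_weights, word_out, tag_out)
loss = joint_loss(word_probs, tag_probs, VOCAB.index("der"), TAGS.index(TAGS[0]))
```

Because the tag loss is computed from the same context vector as the word loss, minimizing the joint loss on annotated tokens pushes the embeddings toward configurations that also separate morphological tags. On unannotated tokens, a semi-supervised setup could simply drop the tag term.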
doi:10.3115/v1/n15-1140 dblp:conf/naacl/CotterellS15 fatcat:7mlw2fkixjambotlpwhts77cse