Word space models of lexical variation

Yves Peirsman, Dirk Speelman
2009 Proceedings of the Workshop on Geometrical Models of Natural Language Semantics - GEMS '09   unpublished
In the recognition of words that are typical of a specific language variety, the classic keyword approach performs rather poorly. We show how this keyword analysis can be complemented with a word space model constructed on the basis of two corpora: one representative of the language variety under investigation, and a reference corpus. This combined approach is able to recognize the markers of a language variety as words that not only have a significantly higher frequency as compared to the reference corpus, but also a different distribution. The application of word space models moreover makes it possible to automatically discover the lexical alternative to a specific marker in the reference corpus.
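As a rough illustration of the two ingredients named in the abstract, the sketch below pairs a keyword statistic with a toy word space built from co-occurrence counts. It is not the authors' implementation: the abstract does not name the statistic, the context definition, or the similarity measure, so the simplified two-term log-likelihood score, the two-word context window, the cosine similarity, and all function names (log_likelihood, context_vectors, cosine, nearest_alternative) are assumptions made for illustration only.

```python
import math
from collections import Counter


def log_likelihood(a, b, n1, n2):
    """Simplified two-term log-likelihood keyword score (an assumed statistic).

    a, b   -- frequency of the word in the variety and reference corpus
    n1, n2 -- total token counts of the variety and reference corpus
    """
    e1 = n1 * (a + b) / (n1 + n2)  # expected frequency in the variety corpus
    e2 = n2 * (a + b) / (n1 + n2)  # expected frequency in the reference corpus
    score = 0.0
    if a > 0:
        score += a * math.log(a / e1)
    if b > 0:
        score += b * math.log(b / e2)
    return 2 * score


def context_vectors(tokens, window=2):
    """Map each word to a Counter of the words found within `window` positions."""
    vectors = {}
    for i, word in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_alternative(marker, variety_vectors, reference_vectors):
    """Return the reference-corpus word whose contexts best match the marker's
    contexts in the variety corpus (the marker itself is excluded)."""
    target = variety_vectors[marker]
    candidates = [w for w in reference_vectors if w != marker]
    return max(candidates, key=lambda w: cosine(target, reference_vectors[w]))


# Toy corpora, invented purely for this example.
variety = "i took the lift up".split()
reference = "i took the elevator up".split()

variety_vectors = context_vectors(variety)
reference_vectors = context_vectors(reference)
alternative = nearest_alternative("lift", variety_vectors, reference_vectors)
```

In this toy setting, a word would count as a marker candidate when its keyword score is high and the cosine between its variety vector and its reference vector is low; both thresholds would have to be tuned on real corpora.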
doi:10.3115/1705415.1705417 fatcat:4iwdwd6mqre55dy2vmdyaxa7xe