A Comprehensive Analysis of PMI-based Models for Measuring Semantic Differences

Taichi Aida, Mamoru Komachi, Toshinobu Ogiso, Hiroya Takamura, Daichi Mochihashi
2021 Pacific Asia Conference on Language, Information and Computation  
The task of detecting words with semantic differences across corpora is mainly addressed by word representations such as word2vec or BERT. However, in the real world, where linguists and sociologists apply these techniques, computational resources are typically limited. In this paper, we extend an existing simultaneously optimized model that can be trained on a CPU to perform this task. Experimental results show that the extended models achieved comparable or superior results to strong baselines
... English corpora and SemEval-2020 Task 1, and also in Japanese. Furthermore, we compared the training time of each model and conducted a comprehensive analysis of Japanese corpora.
dblp:conf/paclic/AidaKOTM21 fatcat:qecopkctzvbpbkbsqeuoa47rya