Cross-language and Cross-encyclopedia Article Linking Using Mixed-language Topic Model and Hypernym Translation

Yu-Chun Wang, Chun-Kai Wu, Richard Tzong-Han Tsai
2014 Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)  
Creating cross-language article links among different online encyclopedias is now an important task in the unification of multilingual knowledge bases. In this paper, we propose a cross-language article linking method that uses a mixed-language topic model and hypernym translation features in an SVM model to link English Wikipedia and Chinese Baidu Baike, the most widely used Wiki-like encyclopedia in China. To evaluate our approach, we compile a data set from the top 500 Baidu Baike articles and their corresponding English Wiki articles. The evaluation results show that our approach achieves an MRR of 80.95% and a recall of 87.46%. Our method does not depend heavily on linguistic characteristics and can be easily extended to generate cross-language article links among different online encyclopedias in other languages.
doi:10.3115/v1/p14-2096 dblp:conf/acl/WangWT14 fatcat:7cxijk4ta5allj4n4564gho2ce