Cross-lingual knowledge linking across wiki knowledge bases

Zhichun Wang, Juanzi Li, Zhigang Wang, Jie Tang
2012 Proceedings of the 21st international conference on World Wide Web - WWW '12  
Wikipedia has become one of the largest knowledge bases on the Web, attracting 513 million page views per day in January 2012. However, one critical issue for Wikipedia is that article coverage is very unbalanced across languages. For example, the English Wikipedia has reached 3.8 million articles, while the Chinese Wikipedia still has fewer than half a million, and there are only 217 thousand cross-lingual links between articles in the two languages. On the other hand, there are more than 3.9 million Chinese wiki articles on Baidu Baike and Hudong.com, two popular Chinese encyclopedias. One important question is how to link the knowledge entries distributed across different knowledge bases. Doing so would greatly enrich the information in online knowledge bases and benefit many applications. In this paper, we study the problem of cross-lingual knowledge linking and present a linkage factor graph model. Features are defined according to some interesting observations. Experiments on the Wikipedia data set show that our approach achieves a precision of 85.8% with a recall of 88.1%.
Keywords: Knowledge Linking, Cross-lingual, Wiki knowledge base, Knowledge sharing
• Linguistics. Existing methods for finding cross-lingual links depend heavily on translation tools. Such methods often yield high precision but low recall. Can we find language-independent features for mining cross-lingual knowledge links?
• Model. There are different kinds of information that
doi:10.1145/2187836.2187899 dblp:conf/www/WangLWT12 fatcat:s6ru5qwb7nfddjvbmgd44ceo5e