Mining Translations of Chinese Names from Web Corpora Using a Query Expansion Technique and Support Vector Machine

Kai-Hsiang Yang, Wei-Da Chen, Hahn-Ming Lee, Jan-Ming Ho
2007 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology - Workshops
Chinese name translation is a special case of named entity translation. It is a challenging problem because many Romanization systems exist and some people add extra words to their English names. Translating a scholar's name into the corresponding English name can help find information about their academic achievements. In this paper, we provide a classification of Chinese names and propose a novel approach to mining Chinese name translations from Web corpora. Our approach uses a query expansion technique and a Support Vector Machine (SVM) to extract name translation candidates, based on three kinds of features: the phonetic similarity, the smallest distance, and the number of appearances in the neighborhood. Experimental results show that our approach can correctly translate the majority of Chinese names.
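The three feature kinds named in the abstract can be illustrated with a minimal sketch. The abstract does not define how each feature is computed, so everything below is an assumption: `phonetic_similarity` uses plain string similarity as a crude stand-in for a real phonetic measure, distance is measured in tokens, the "neighborhood" is a fixed token window, and all function names, the `window` parameter, and the sample name are hypothetical. The resulting feature tuple is the kind of input an SVM classifier could be trained on; the classifier itself is not shown.

```python
from difflib import SequenceMatcher


def phonetic_similarity(romanized, candidate):
    """Assumed stand-in: character-level similarity between a romanization
    of the Chinese name and an English candidate (not a true phonetic model)."""
    a = romanized.lower().replace(" ", "").replace("-", "")
    b = candidate.lower().replace(" ", "").replace("-", "")
    return SequenceMatcher(None, a, b).ratio()


def positions(tokens, phrase):
    """Start indices where the token sequence `phrase` occurs in `tokens`."""
    n = len(phrase)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase]


def candidate_features(tokens, name, romanized, candidate, window=10):
    """Return (phonetic similarity, smallest token distance, nearby count)
    for one candidate in one tokenized Web snippet."""
    name_pos = positions(tokens, [name])
    cand_pos = positions(tokens, candidate.split())
    # Smallest distance between any occurrence of the name and the candidate.
    dists = [abs(c - n) for c in cand_pos for n in name_pos]
    smallest = min(dists) if dists else None
    # Candidate occurrences that fall within `window` tokens of the name.
    nearby = sum(
        1 for c in cand_pos if any(abs(c - n) <= window for n in name_pos)
    )
    return (phonetic_similarity(romanized, candidate), smallest, nearby)
```

For a hypothetical snippet such as `["王小明", "(", "Xiao-Ming", "Wang", ")", ...]`, calling `candidate_features(tokens, "王小明", "Wang Xiao Ming", "Xiao-Ming Wang")` yields one feature tuple per candidate. A candidate that never appears in the snippet gets `None` as its distance, which a real pipeline would need to encode numerically before classification.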
doi:10.1109/wiiatw.2007.4427644 dblp:conf/iat/YangCLH07 fatcat:n7rgywranjbwjeu7qtngdshk2q