Lecture Notes in Computer Science
This paper explores bridging the content of two different languages via latent topics. Specifically, we propose a unified probabilistic model that simultaneously models latent topics from bilingual corpora discussing comparable content, and we use the topics as features in a cross-lingual, dictionary-less text categorization task. Experimental results on multilingual Wikipedia data show that the proposed topic model effectively discovers the topic information from the bilingual corpora, and the ...
doi:10.1007/978-3-642-20841-6_45 fatcat:75nuuftzqbaprmiz3n3giusdwi