Cross-language information retrieval models based on latent topic models trained with document-aligned comparable corpora
2012
Information retrieval (Boston)
In this paper, we study different applications of cross-language latent topic models trained on comparable corpora. The first focus lies on the task of cross-language information retrieval (CLIR). The Bilingual Latent Dirichlet Allocation model (BiLDA) allows us to create an interlingual, language-independent representation of both queries and documents. We construct several BiLDA-based document models for CLIR, where no additional translation resources are used. The second focus lies on the
doi:10.1007/s10791-012-9200-5
fatcat:ednqhlfih5dcphmyagg4hzz37i