Hierarchical Pitman-Yor language model for information retrieval

Saeedeh Momtazi, Dietrich Klakow
2010 Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval - SIGIR '10  
In this paper, we propose a new application of a Bayesian language model based on the Pitman-Yor process for information retrieval. This model is a generalization of the Dirichlet distribution. The Pitman-Yor process produces a power-law distribution, which is one of the statistical properties of word frequencies in natural language. Our experiments on Robust04 indicate that this model improves document retrieval performance compared to the commonly used Dirichlet prior and absolute discounting smoothing techniques.
doi:10.1145/1835449.1835619 dblp:conf/sigir/MomtaziK10 fatcat:kcqypr4rrbbyvaxt6e77whra5y
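
The relationship the abstract describes between the three smoothing methods can be illustrated with a small sketch. The Dirichlet prior and absolute discounting formulas below are the standard textbook forms. The Pitman-Yor estimate is a simplified predictive probability that assumes every word type seen in the document occupies exactly one table; that assumption, the function names, the toy corpus, and the parameter values are all illustrative and are not taken from the paper.

```python
from collections import Counter


def dirichlet_prior(count, doc_len, p_coll, mu):
    """Dirichlet prior smoothing: pseudo-counts mu * p(w|C) are added to the document."""
    return (count + mu * p_coll) / (doc_len + mu)


def absolute_discounting(count, doc_len, n_unique, p_coll, delta):
    """Absolute discounting: subtract delta from each seen count and
    redistribute the freed mass according to the collection model."""
    return (max(count - delta, 0.0) + delta * n_unique * p_coll) / doc_len


def pitman_yor(count, doc_len, n_unique, p_coll, theta, discount):
    """Simplified Pitman-Yor predictive probability.

    ASSUMPTION (not from the paper): each seen word type sits at exactly
    one table, so the table count for a word is 1 if seen, else 0, and the
    total table count equals the number of unique terms in the document.
    """
    tables_w = 1 if count > 0 else 0
    numerator = count - discount * tables_w + (theta + discount * n_unique) * p_coll
    return numerator / (theta + doc_len)


# Toy data, purely hypothetical.
collection = "the cat sat on the mat the dog sat on the log".split()
document = "the cat sat on the mat".split()

coll_counts = Counter(collection)
doc_counts = Counter(document)  # unseen words yield a count of 0
p_coll = {w: c / len(collection) for w, c in coll_counts.items()}
doc_len = len(document)
n_unique = len(doc_counts)


def py_distribution(theta, discount):
    """Smoothed document model over the whole collection vocabulary."""
    return {
        w: pitman_yor(doc_counts[w], doc_len, n_unique, p_coll[w], theta, discount)
        for w in p_coll
    }


if __name__ == "__main__":
    for word, prob in sorted(py_distribution(theta=2.0, discount=0.5).items()):
        print(f"{word}\t{prob:.4f}")
```

Under this one-table assumption, setting `discount=0` reduces the estimate to Dirichlet prior smoothing with `mu = theta`, and setting `theta=0` reduces it to absolute discounting with `delta = discount`, which is one way to read the claim that the model generalizes both baselines.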