Language Modeling Approaches to Information Retrieval

Protima Banerjee, Hyo-Il Han
2009 Journal of Computing Science and Engineering  
This article surveys recent research in the area of language modeling (sometimes called statistical language modeling) approaches to information retrieval. Language modeling is a formal probabilistic retrieval framework with roots in speech recognition and natural language processing. The underlying assumption of language modeling is that human language generation is a random process; the goal is to model that process via a generative statistical model. In this article, we discuss current research in the application of language modeling to information retrieval, the role of semantics in the language modeling framework, cluster-based language models, the use of language modeling for XML retrieval, and future trends.
doi:10.5626/jcse.2009.3.3.143 fatcat:ixxzdlrevfd35jemld6vfdrndi