Models for retrieval with probabilistic indexing

Norbert Fuhr
1989 Information Processing & Management  
In this article three retrieval models for probabilistic indexing are described, along with evaluation results for each. First is the binary independence indexing (BII) model, which is a generalized version of the Maron and Kuhns indexing model. In this model, the indexing weight of a descriptor in a document is an estimate of the probability of relevance of this document with respect to queries using this descriptor. Second is the retrieval-with-probabilistic-indexing (RPI) model, which is ... to different kinds of probabilistic indexing. For that we assume that each indexing scheme has its own concept of "correctness" to which the probabilities relate. In addition to the probabilistic indexing weights, the RPI model provides the possibility of relevance weighting of search terms. A third, similar model was proposed by Croft some years ago as an extension of the binary independence retrieval model, but it can be shown that this model is not based on the probabilistic ranking principle. The probabilistic indexing weights required for any of these models can be provided by an application of the Darmstadt indexing approach (DIA) for indexing with descriptors from a controlled vocabulary. The experimental results show significant improvements over retrieval with binary indexing. Finally, suggestions are made regarding how the DIA can be applied to probabilistic indexing with free text terms.
doi:10.1016/0306-4573(89)90091-5 fatcat:lkf75n35k5hkvl4uqxzrsfalgy