Best and Fairest: An Empirical Analysis of Retrieval System Bias [chapter]

Colin Wilkie, Leif Azzopardi
2014 Lecture Notes in Computer Science  
In this paper, we explore the bias of term weighting schemes used by retrieval models. Here, we consider bias to be the extent to which a retrieval model unduly favours certain documents over others because of characteristics within and about the document. We set out to find the least biased retrieval model/weighting. This is largely motivated by the recent proposal of a new suite of retrieval models based on the Divergence From Independence (DFI) framework. The claim is that such models provide the fairest term weighting because they do not make assumptions about the term distribution (unlike most other retrieval models). In this paper, we empirically examine whether fairness is linked to performance and answer the question: is fairer better?
Footnote 1: In [12], it was shown that Language Modelling with Jelinek-Mercer smoothing provides a probabilistic justification for TF.IDF.
doi:10.1007/978-3-319-06028-6_2 fatcat:w3uigrxom5c4vnsaepdgnpbli4