Random walk term weighting for information retrieval

Roi Blanco, Christina Lioma
2007 Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '07)
We present a way of estimating term weights for Information Retrieval (IR), using term co-occurrence as a measure of dependency between terms. We apply the random walk graph-based ranking algorithm to a graph that encodes terms and co-occurrence dependencies in text, and derive term weights that quantify how much a term contributes to its context. Evaluation on two TREC collections and 350 topics shows that the random walk-based term weights perform at least comparably to the traditional tf·idf term weighting, and outperform it when the distance between co-occurring terms is between 6 and 30 terms.
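
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact graph construction or walk parameters, so the undirected edges, the PageRank-style update, the damping factor of 0.85, the fixed iteration count, and the function names `cooccurrence_graph` and `random_walk_weights` are all assumptions made for illustration.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=6):
    """Link terms that co-occur within a span of `window` tokens.

    Assumption: edges are undirected and unweighted.
    """
    neighbours = defaultdict(set)
    for i, term in enumerate(tokens):
        # Look ahead at the tokens that share a window with `term`.
        for other in tokens[i + 1 : i + window]:
            if other != term:
                neighbours[term].add(other)
                neighbours[other].add(term)
    # Make sure every term is a node, even if it has no edges.
    for term in tokens:
        neighbours[term]
    return dict(neighbours)


def random_walk_weights(graph, damping=0.85, iterations=50):
    """PageRank-style scores over the term graph (assumed parameters)."""
    n = len(graph)
    scores = {term: 1.0 / n for term in graph}
    for _ in range(iterations):
        updated = {}
        for term in graph:
            # Each neighbour passes on its score split evenly over its edges.
            incoming = sum(scores[nb] / len(graph[nb]) for nb in graph[term])
            updated[term] = (1 - damping) / n + damping * incoming
        scores = updated
    return scores


if __name__ == "__main__":
    text = "random walk term weighting uses term co-occurrence in text".split()
    weights = random_walk_weights(cooccurrence_graph(text, window=3))
    for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
        print(f"{term}\t{weight:.4f}")
```

In this sketch the `window` argument plays the role of the co-occurrence distance mentioned in the abstract; the resulting scores would stand in for term frequency inside a ranking function, which is not shown here.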
doi:10.1145/1277741.1277930 dblp:conf/sigir/BlancoL07 fatcat:5uolqrtyane27dw6lpoemf25ra