WordRank: A Method for Ranking Web Pages Based on Content Similarity

Apostolos Kritikopoulos, Martha Sideri, Iraklis Varlamis
2007 24th British National Conference on Databases (BNCOD'07)  
This paper presents WordRank, a new page ranking system that exploits the similarity between interconnected pages. WordRank introduces the model of the 'biased surfer', which is based on the following assumption: "the visitor of a web page tends to visit web pages with similar content rather than content irrelevant pages". The algorithm modifies the random surfer model by biasing the probability that a user follows a link in favor of links to pages with similar content. It is our intuition that WordRank is most appropriate in topic-based searches, since it prioritizes strongly interconnected pages and, at the same time, is more robust to the multitude of topics and to the noise produced by navigation links. This paper presents preliminary experimental evidence from a search engine we developed for the Greek fragment of the World Wide Web. For evaluation purposes, we introduce a new metric (SI score), which is based on implicit user feedback, but we also employ explicit evaluation where available.
doi:10.1109/bncod.2007.24 dblp:conf/bncod/KritikopoulosSV07 fatcat:mosxv5vj5rd7tbd4vdacdbvh3e
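
The biased surfer idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the similarity measure or any parameters, so the sketch assumes cosine similarity over bag-of-words term counts, a damping factor of 0.85, a uniform random jump, and a small smoothing constant `eps` so that links to dissimilar pages keep a small non-zero probability. The names `cosine`, `biased_surfer_rank`, `links`, and `texts` are hypothetical.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def biased_surfer_rank(links, texts, damping=0.85, eps=0.01, iters=50):
    """Power iteration in which link-following is biased by content similarity.

    links: dict mapping a page to the list of pages it links to.
    texts: dict mapping every page to its text content.
    damping, eps, iters: assumed values, not taken from the paper.
    """
    pages = list(texts)
    n = len(pages)
    bags = {p: Counter(texts[p].lower().split()) for p in pages}

    # Transition probabilities: each out-link is weighted by the similarity
    # between source and target (plus eps), then normalised per source page.
    trans = {}
    for p in pages:
        weights = {q: cosine(bags[p], bags[q]) + eps
                   for q in links.get(p, []) if q in bags}
        total = sum(weights.values())
        trans[p] = {q: w / total for q, w in weights.items()}

    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # Rank held by pages without out-links is spread uniformly.
        dangling = sum(rank[p] for p in pages if not trans[p])
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            for q, prob in trans[p].items():
                new_rank[q] += damping * rank[p] * prob
        rank = new_rank
    return rank


# Toy graph: page "a" links to a topically similar page "b" and to a
# navigation-style page "c" that shares no terms with it.
links = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
texts = {
    "a": "greek web search ranking",
    "b": "greek web search engine",
    "c": "site map contact login",
}
ranks = biased_surfer_rank(links, texts)
```

In this toy graph a plain random surfer would split the rank flowing out of "a" evenly between "b" and "c"; under the assumed similarity weighting, almost all of it goes to "b", which is the kind of robustness to navigation links that the abstract describes.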