Term Ranking for Clustering Web Search Results

Fatih Gelgi, Hasan Davulcu, Srinivas Vadrevu
2007 International Workshop on the Web and Databases  
Clustering web search engine results for ambiguous keyword searches poses unique challenges. First, we show that frequency-based feature ranking, as used in text document clustering, cannot be readily imported to cluster web search results. Next, we present TermRank, a variation of the PageRank algorithm based on a relational graph representation of the content of web document collections. TermRank achieves a desirable ranking, placing discriminative terms higher than ambiguous terms, and
... g ambiguous terms higher than common terms. We experiment with two clustering algorithms to demonstrate the efficacy of TermRank. TermRank is shown to perform substantially better than classical frequency-based methods.
dblp:conf/webdb/GelgiDV07 fatcat:teurefutwjbcfnkxlqti5u4frq