On Combining Link and Contents Information for Web Page Clustering [chapter]

Yitong Wang, Masaru Kitsuregawa
2002 Lecture Notes in Computer Science  
Clustering is currently one of the most crucial techniques for dealing with (e.g. locating resources, interpreting information) the massive amount of heterogeneous information on the web, which is beyond human capacity to digest. In this paper, we discuss the shortcomings of previous approaches and present a unifying clustering algorithm that clusters web search results for a specific query topic by combining link and contents information. In particular, we investigate how to combine link and contents analysis in the clustering process to improve the quality and interpretation of web search results. The proposed approach automatically clusters the web search results into high-quality, semantically meaningful groups in a concise, easy-to-interpret hierarchy with tagging terms. Preliminary experiments and evaluations are conducted, and the experimental results show that the proposed approach is effective and promising.
doi:10.1007/3-540-46146-9_89 fatcat:qjog7nqpi5fvlotxep5menr5pm