A search result clustering method using informatively named entities

Hiroyuki Toda, Ryoji Kataoka
2005 Proceedings of the seventh ACM international workshop on Web information and data management - WIDM '05  
Clustering the results of a search helps the user get an overview of the information returned. In this paper, we regard the clustering task as indexing the search results. Here, an index means a structured label list that makes it easier for the user to comprehend the labels and search results. To realize this goal, we make three proposals. The first is to use Named Entity (NE) extraction for term extraction. The second is a new label selection criterion based on importance in the search result and the ... between terms and search queries. The third is label categorization using the category information of labels, which is generated by NE extraction. We implement a prototype system based on these proposals and find that it offers much higher performance than existing methods; we focus on news articles in this paper.
doi:10.1145/1097047.1097063 dblp:conf/widm/TodaK05 fatcat:7tmibgnfgnezvatev7uz7hromi