Reexamining the cluster hypothesis

Marti A. Hearst, Jan O. Pedersen
1996 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '96  
We present Scatter/Gather, a cluster-based document browsing method, as an alternative to ranked titles for the organization and viewing of retrieval results. We systematically evaluate Scatter/Gather in this context and find significant improvements over similarity search ranking alone. This result provides evidence validating the cluster hypothesis, which states that relevant documents tend to be more similar to each other than to non-relevant documents. We describe a system employing Scatter/Gather and demonstrate that users are able to use this system close to its full potential.
doi:10.1145/243199.243216 dblp:conf/sigir/HearstP96 fatcat:r3xc77iwdjgq7krnpwwojuqaly