Personal Name Disambiguation in Web Search Results Using a Semi-Supervised Clustering Approach

Kazunari Sugiyama, Manabu Okumura
2009 Journal of Natural Language Processing  
Personal names are often submitted to search engines as query keywords. However, in response to a personal name query, search engines return a long list of search results that contains Web pages about several namesakes. To address this problem, most previous work on disambiguating personal names in Web search results employs agglomerative clustering approaches. In contrast, we have adopted a semi-supervised clustering approach to integrate similar documents into a seed document. Our proposed semi-supervised clustering approach is novel in that it controls the fluctuation of the centroid of a cluster.
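
The abstract does not say how the centroid's fluctuation is controlled, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes bag-of-words vectors, cosine similarity, and a centroid that is kept as a weighted blend of the seed vector and the mean of the merged pages. The function names and the `threshold` and `seed_weight` parameters are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Term-frequency vector; a stand-in for real Web page features."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term -> weight mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def damped_centroid(seed_vec, member_vecs, seed_weight):
    """Blend the seed with the mean of merged pages.

    Assumption: a large seed_weight keeps the centroid close to the seed,
    which is one simple way to limit how far the centroid can move.
    """
    if not member_vecs:
        return dict(seed_vec)
    total = Counter()
    for vec in member_vecs:
        total.update(vec)  # Counter.update adds counts term by term
    mean = {t: c / len(member_vecs) for t, c in total.items()}
    terms = set(seed_vec) | set(mean)
    return {
        t: seed_weight * seed_vec.get(t, 0.0)
        + (1.0 - seed_weight) * mean.get(t, 0.0)
        for t in terms
    }


def seeded_clustering(seeds, pages, threshold=0.2, seed_weight=0.8):
    """Merge each page into its most similar seed cluster.

    seeds: dict mapping a namesake label to the text of its seed page.
    pages: list of page texts to disambiguate.
    Returns one label per page, or None when no seed is similar enough.
    """
    seed_vecs = {label: vectorize(text) for label, text in seeds.items()}
    centroids = {label: dict(vec) for label, vec in seed_vecs.items()}
    members = {label: [] for label in seeds}
    assignments = []
    for page in pages:
        vec = vectorize(page)
        best = max(centroids, key=lambda label: cosine(vec, centroids[label]))
        if cosine(vec, centroids[best]) < threshold:
            assignments.append(None)  # left unmerged
            continue
        members[best].append(vec)
        centroids[best] = damped_centroid(
            seed_vecs[best], members[best], seed_weight
        )
        assignments.append(best)
    return assignments


seeds = {
    "chemist": "john smith chemistry professor polymer research",
    "footballer": "john smith football striker league goals",
}
pages = [
    "smith polymer chemistry lab research group",
    "smith scored goals in the football league",
    "recipe for apple pie",
]
result = seeded_clustering(seeds, pages)
print(result)  # ['chemist', 'footballer', None]
```

With `seed_weight=1.0` the centroid never moves from the seed, and with `seed_weight=0.0` it becomes the plain mean of the merged pages, so the parameter spans the range between a fixed and a freely drifting centroid.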
doi:10.5715/jnlp.16.5_23 fatcat:lvb6jwgpqrdnfbakbdf7lpwqgu