An MCL-Based Text Mining Approach for Namesake Disambiguation on the Web

Tarique Anwar, Muhammad Abulaish
2012 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology
In this paper, we propose a Markov CLustering (MCL) based text mining approach for namesake disambiguation on the Web. The novelty of the proposed technique lies in modeling the collection of webpages using a weighted graph structure and applying MCL to crystalize it into different clusters, each one containing the webpages related to a particular namesake individual. The proposed method focuses on three broad and realistic aspects to cluster webpages retrieved through search engines: content overlapping, structure overlapping, and local context overlapping. The efficacy of the proposed method is demonstrated through experimental evaluations on standard datasets.
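
To make the clustering step concrete, the following is a minimal sketch of generic Markov Clustering on a weighted graph, not the authors' implementation. It assumes a symmetric page-to-page weight matrix has already been computed; how the paper combines content, structure, and local context overlap into edge weights is not stated in the abstract, so the weights below are made-up illustrative values. The function names and the `expansion`, `inflation`, `max_iter`, and `tol` parameters are hypothetical choices for this example.

```python
import numpy as np


def normalize_columns(m):
    """Scale each column to sum to 1 (column-stochastic matrix)."""
    return m / m.sum(axis=0, keepdims=True)


def markov_cluster(weights, expansion=2, inflation=2.0, max_iter=100, tol=1e-6):
    """Generic MCL sketch: alternate expansion and inflation until stable.

    weights: square, symmetric, non-negative matrix of edge weights
             between webpages (assumed to be precomputed).
    Returns a sorted list of clusters, each a tuple of page indices.
    """
    a = np.asarray(weights, dtype=float)
    a = a + np.eye(a.shape[0])          # self-loops, a common MCL convention
    m = normalize_columns(a)
    for _ in range(max_iter):
        previous = m
        m = np.linalg.matrix_power(m, expansion)        # expansion: random-walk flow
        m = normalize_columns(np.power(m, inflation))   # inflation: favor strong flow
        if np.allclose(m, previous, atol=tol):
            break
    # Each row that retains mass lists the pages attracted to it.
    clusters = set()
    for row in m:
        members = tuple(int(i) for i in np.nonzero(row > 1e-6)[0])
        if members:
            clusters.add(members)
    return sorted(clusters)


# Illustrative weights only: pages 0-2 overlap strongly, pages 3-4 overlap
# strongly, and pages 2 and 3 share a weak link.
example_weights = np.array([
    [0.0, 1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
])
print(markov_cluster(example_weights))
```

In this sketch, a larger `inflation` value tends to produce more, smaller clusters, which in the namesake setting would correspond to splitting the retrieved pages across more distinct individuals.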
doi:10.1109/wi-iat.2012.239 dblp:conf/webi/AnwarA12 fatcat:gflbbvxnlrao7jwuqcllczvn7u