Pairwise clustering based on the mutual-information criterion

Amir Alush, Avishay Friedman, Jacob Goldberger
2016 Neurocomputing  
Pairwise clustering methods partition a dataset using pairwise similarity between data points. The pairwise similarity matrix can be used to define a Markov random walk on the data points. This view forms a probabilistic interpretation of spectral clustering methods. We utilize this probabilistic model to define a novel clustering cost function that is based on maximizing the mutual information between consecutively visited clusters of states of the Markov chain defined by the similarity matrix. This cost function can be viewed as an extension of the information-bottleneck principle to the case of pairwise clustering. We show that the complexity of a sequential clustering implementation of the suggested cost function is linear in the dataset size on sparse graphs. The improved performance and the reduced computational complexity of the proposed algorithm are demonstrated on several standard datasets and on an image segmentation task.
doi:10.1016/j.neucom.2015.12.025 fatcat:sf5pnf47ove7rlzfa6vifrceuy
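
The cost function described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions not stated in the record: the similarity matrix `W` is symmetric and non-negative, the random walk is started from its stationary distribution (so the joint probability of two consecutive states i, j is W_ij divided by the sum of all entries), and cluster labels are integers 0..k-1. The function name `cluster_mutual_information` is hypothetical. The sketch only evaluates the mutual information for a given partition; it does not perform the sequential optimization mentioned in the abstract.

```python
import numpy as np


def cluster_mutual_information(W, labels):
    """Mutual information (in nats) between the clusters visited at two
    consecutive steps of a stationary random walk defined by W.

    Assumptions (for illustration only): W is symmetric and non-negative,
    and labels are integers in 0..k-1.
    """
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels, dtype=int)

    # For symmetric W the stationary distribution is proportional to the
    # node degrees, so the joint of consecutive states is W / sum(W).
    joint_states = W / W.sum()

    # Aggregate the state-level joint into a k x k cluster-level joint.
    k = labels.max() + 1
    one_hot = np.eye(k)[labels]              # n x k membership matrix
    joint = one_hot.T @ joint_states @ one_hot

    # Marginals of the current and the next cluster.
    p_now = joint.sum(axis=1)
    p_next = joint.sum(axis=0)
    independent = np.outer(p_now, p_next)

    # Sum p(a, b) * log(p(a, b) / (p(a) p(b))) over non-zero entries.
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / independent[mask])))


if __name__ == "__main__":
    # Two tight pairs of points with weak similarity across the pairs;
    # the diagonal holds each point's self-similarity.
    W = np.array([
        [1.0, 1.0, 0.1, 0.1],
        [1.0, 1.0, 0.1, 0.1],
        [0.1, 0.1, 1.0, 1.0],
        [0.1, 0.1, 1.0, 1.0],
    ])
    print(cluster_mutual_information(W, [0, 0, 1, 1]))  # partition along the pairs
    print(cluster_mutual_information(W, [0, 1, 0, 1]))  # partition across the pairs
```

In this toy example the partition that follows the two tight pairs yields a higher score than the one that splits them, which is the behavior a maximization of this criterion would favor.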