Similarity-based clustering by left-stochastic matrix factorization

Raman Arora, Maya R. Gupta, Amol Kapila, Maryam Fazel
2013 Journal of Machine Learning Research
For similarity-based clustering, we propose modeling the entries of a given similarity matrix as the inner products of the unknown cluster probabilities. To estimate the cluster probabilities from the given similarity matrix, we introduce a left-stochastic non-negative matrix factorization problem. A rotation-based algorithm is proposed for the matrix factorization. Conditions for unique matrix factorizations and clusterings are given, and an error bound is provided. The algorithm is ... y efficient for the case of two clusters, which motivates a hierarchical variant for cases where the number of desired clusters is large. Experiments show that the proposed left-stochastic decomposition clustering model produces relatively high within-cluster similarity on most data sets and can match given class labels, and that the efficient hierarchical variant performs surprisingly well.
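
The modeling assumption in the abstract can be illustrated with a minimal sketch. It covers only the forward model (similarity entries as inner products of cluster probability vectors), not the paper's rotation-based factorization algorithm. The function names, the toy matrix `P`, and the use of NumPy are assumptions made for illustration. The abstract does not mention any scaling of the similarity matrix, so none is applied here.

```python
import numpy as np


def is_left_stochastic(P, tol=1e-9):
    """Return True if P is non-negative and every column sums to 1."""
    return bool(np.all(P >= -tol) and np.allclose(P.sum(axis=0), 1.0, atol=tol))


def model_similarity(P):
    """Similarity matrix whose (i, j) entry is the inner product of
    columns i and j of P (the cluster probabilities of samples i and j)."""
    return P.T @ P


def hard_clusters(P):
    """Assign each sample to its most probable cluster."""
    return P.argmax(axis=0)


# Hypothetical toy data: k = 2 clusters (rows), n = 4 samples (columns).
P = np.array([[0.9, 0.8, 0.1, 0.3],
              [0.1, 0.2, 0.9, 0.7]])

K = model_similarity(P)    # 4 x 4 symmetric similarity matrix
labels = hard_clusters(P)  # one cluster index per sample

print(is_left_stochastic(P))
print(np.round(K, 2))
print(labels)
```

In this toy case, samples whose probability vectors concentrate on the same cluster have larger inner products than samples from different clusters, which is the structure the factorization problem tries to recover when only `K` is given.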
dblp:journals/jmlr/AroraGKF13 fatcat:4xutprxsb5cbjana7allvwguti