Clustering with Soft and Group Constraints [chapter]

Martin H. C. Law, Alexander Topchy, Anil K. Jain
2004 Lecture Notes in Computer Science  
Several clustering algorithms equipped with pairwise hard constraints between data points are known to improve the accuracy of clustering solutions. We develop a new clustering algorithm that extends mixture clustering in the presence of (i) soft constraints, and (ii) group-level constraints. Soft constraints can reflect the uncertainty associated with a priori knowledge about pairs of points that should or should not belong to the same cluster, while group-level constraints can capture larger building blocks of the target partition when afforded by the side information. Assuming that the data points are generated by a mixture of Gaussians, we derive the EM algorithm to estimate the parameters of different clusters. An empirical study demonstrates that the use of soft constraints results in superior data partitions normally unattainable without constraints. Further, the solutions are more robust when the hard constraints may be incorrect.
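
To make the idea of group-level constraints in a Gaussian mixture concrete, here is a minimal sketch. It is not the authors' algorithm: it assumes the simplest possible reading, in which every point in a group is forced to share one cluster label, it uses spherical covariances, and it does not model soft constraints at all. The function name `em_grouped_gmm` and all of its parameters are hypothetical.

```python
import numpy as np

def em_grouped_gmm(X, groups, init_means, n_iter=50):
    """EM for a spherical Gaussian mixture with hard same-cluster groups.

    Assumption (not from the paper): all points with the same group id
    share a single posterior over clusters. Singleton groups are allowed.
    X: (n, d) data, groups: (n,) group ids, init_means: (k, d) start values.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    means = np.array(init_means, dtype=float)
    k = means.shape[0]
    # Relabel group ids to 0..G-1
    _, gid = np.unique(np.asarray(groups).ravel(), return_inverse=True)
    gid = gid.ravel()
    G = gid.max() + 1
    var = np.full(k, X.var())          # one variance per component
    weights = np.full(k, 1.0 / k)      # mixing proportions
    for _ in range(n_iter):
        # E-step: per-point log density under each component
        sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        logp = -0.5 * (sq / var + d * np.log(2 * np.pi * var))
        # Sum log densities within each group, add the log prior once per group
        glog = np.zeros((G, k))
        np.add.at(glog, gid, logp)
        glog += np.log(weights)
        glog -= glog.max(axis=1, keepdims=True)   # numerical stability
        gresp = np.exp(glog)
        gresp /= gresp.sum(axis=1, keepdims=True)
        resp = gresp[gid]              # each point inherits its group's posterior
        # M-step: weighted updates of means, variances, and mixing proportions
        nk = resp.sum(axis=0) + 1e-12
        means = (resp.T @ X) / nk[:, None]
        sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        var = (resp * sq).sum(axis=0) / (d * nk) + 1e-6
        weights = gresp.sum(axis=0) / G
    return resp.argmax(axis=1), means
```

Because each point inherits its group's posterior, the returned labels are constant within every group by construction. This is also why such a scheme is brittle when a group is wrong, which is the situation soft constraints are meant to handle.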
doi:10.1007/978-3-540-27868-9_72 fatcat:g6acmcfsbnaspdtjtmkzglzwpu