Clustering with Balancing Constraints [chapter]

Joydeep Ghosh, Ayhan Demiriz
2008 Constrained Clustering  
In many applications of clustering, balanced solutions, i.e., those in which the clusters obtained are of comparable sizes, are preferred. This chapter describes several approaches to obtaining balanced clustering results that also scale well to large data sets. First, we describe a general scalable framework for balanced clustering that initially clusters only a small subset of the data and then efficiently allocates the rest of the data to these initial clusters while simultaneously refining the clustering. Next, we discuss how frequency sensitive competitive learning can be used for balanced clustering in both batch and on-line scenarios, and illustrate the mechanism with a case study of clustering directional data such as text documents. Finally, we briefly outline balanced clustering based on other methods such as graph partitioning and mixture modeling.
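As a rough illustration of the frequency sensitive idea mentioned in the abstract, the sketch below runs an on-line competitive learner in which each cluster's distortion is scaled by how often that cluster has already won, so heavily used clusters become less likely to win again. This is not taken from the chapter: the function name `fscl`, the count-times-squared-Euclidean-distance rule, the learning rate, and the toy data are all assumptions made for illustration, and the chapter's own case study concerns directional data, for which a different similarity measure would be appropriate.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def fscl(points, k, epochs=5, lr=0.1, seed=0):
    """On-line frequency sensitive competitive learning (illustrative sketch).

    Assumption: the winner for a point x is the cluster minimizing
    wins[c] * squared_distance(x, center[c]).
    Returns the learned centers and the per-cluster win counts.
    """
    points = list(points)
    rng = random.Random(seed)
    # Initialize the centers from k distinct randomly chosen points.
    centers = [list(p) for p in rng.sample(points, k)]
    wins = [1] * k  # start at 1 so the scaled distortion is never zeroed out
    for _ in range(epochs):
        order = points[:]
        rng.shuffle(order)
        for x in order:
            # Frequent winners are penalized by their win count.
            j = min(range(k), key=lambda c: wins[c] * sq_dist(x, centers[c]))
            # Move the winning center a small step toward the point.
            centers[j] = [m + lr * (xi - m) for m, xi in zip(centers[j], x)]
            wins[j] += 1
    return centers, wins


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0), (9.1, 0.1)]
    centers, wins = fscl(data, k=3)
    print(centers)
    print(wins)
```

The win counts give a quick view of how evenly the clusters were used during training; a hard balance guarantee would need an explicit size constraint, which this sketch does not impose.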
doi:10.1201/9781584889977.ch8 fatcat:kj5gtm37ebbmtcvk3zw2dw2bde