On the Lower Bound of Local Optimums in K-Means Algorithm

Zhenjie Zhang, Bing Dai, Anthony Tung
2006 IEEE International Conference on Data Mining. Proceedings  
The k-means algorithm is a popular clustering method used in many fields of computer science, such as data mining, machine learning, and information retrieval. However, k-means is very likely to converge to a local optimum that is much worse than the desired global optimum. To overcome this problem, k-means and its variants are usually run many times with different initial centers to avoid being trapped in local optima of unacceptable quality. In this paper, we propose an efficient method to compute a lower bound on the cost of the local optimum reachable from the current center set. After every k-means iteration, the algorithm can halt the run if the lower bound on the cost of the future local optimum is worse than the best solution computed so far. Although the lower bound computation adds some time to each iteration, extensive experiments on both synthetic and real data sets show that the method can greatly reduce the number of unnecessary iterations and improves the efficiency of the algorithm on most data sets, especially those with high dimensionality and large k.
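The control flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the bound itself, so `lower_bound` is a hypothetical caller-supplied function that must return a valid lower bound on the cost of the local optimum reachable from the current centers. The names `pruned_kmeans`, `lloyd_step`, `cost`, and `sq_dist` are likewise assumptions made for this example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cost(points, centers):
    """k-means cost: sum of squared distances to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def lloyd_step(points, centers):
    """One k-means iteration: assign points, then recompute centers."""
    k = len(centers)
    groups = [[] for _ in range(k)]
    for p in points:
        nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
        groups[nearest].append(p)
    new_centers = []
    for j, group in enumerate(groups):
        if group:
            dim = len(group[0])
            new_centers.append(
                tuple(sum(p[d] for p in group) / len(group) for d in range(dim))
            )
        else:
            # Keep the old center if its cluster became empty.
            new_centers.append(tuple(centers[j]))
    return new_centers


def pruned_kmeans(points, k, restarts, lower_bound, seed=0, max_iter=100):
    """Multi-restart k-means that abandons a run when its bound is too high.

    lower_bound(points, centers) is a hypothetical placeholder for the
    paper's bound; it must never exceed the cost of the local optimum
    that the run would eventually reach.
    """
    rng = random.Random(seed)
    best_cost, best_centers, pruned = float("inf"), None, 0
    for _ in range(restarts):
        centers = [tuple(p) for p in rng.sample(list(points), k)]
        abandoned = False
        for _ in range(max_iter):
            new_centers = lloyd_step(points, centers)
            converged = new_centers == centers
            centers = new_centers
            if converged:
                break
            # Halt early: this run cannot beat the best solution so far.
            if lower_bound(points, centers) > best_cost:
                abandoned = True
                pruned += 1
                break
        if not abandoned:
            run_cost = cost(points, centers)
            if run_cost < best_cost:
                best_cost, best_centers = run_cost, centers
    return best_centers, best_cost, pruned


# Usage with the trivial bound 0.0, which is always valid but never prunes.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
best_centers, best_cost, pruned = pruned_kmeans(
    points, k=2, restarts=20, lower_bound=lambda pts, ctrs: 0.0
)
```

With the trivial bound the loop behaves like ordinary multi-restart k-means. Any savings depend entirely on how tight and how cheap the supplied bound is, which is the trade-off the abstract evaluates experimentally.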
doi:10.1109/icdm.2006.118 dblp:conf/icdm/ZhangDT06 fatcat:6spyzoiccjfw5gwfs4g6ivxxae