Clustering under Perturbation Stability in Near-Linear Time [article]

Pankaj K. Agarwal and Hsien-Chih Chang and Kamesh Munagala and Erin Taylor and Emo Welzl
2020 arXiv pre-print
We consider the problem of center-based clustering in low-dimensional Euclidean spaces under the perturbation stability assumption. An instance is α-stable if the underlying optimal clustering remains optimal even when all pairwise distances are arbitrarily perturbed by a factor of at most α. Our main contribution is a set of efficient exact algorithms for α-stable clustering instances whose running times depend near-linearly on the size of the data set when α ≥ 2 + √3. For k-center and k-means problems, our algorithms also achieve polynomial dependence on the number of clusters, k, when α ≥ 2 + √3 + ϵ for any constant ϵ > 0 in any fixed dimension. For k-median, our algorithms have polynomial dependence on k for α > 5 in any fixed dimension, and for α ≥ 2 + √3 in two dimensions. Our algorithms are simple and require only applying techniques such as local search or dynamic programming to a suitably modified metric space, combined with a careful choice of data structures.
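
The definition of α-stability can be made concrete on a toy instance. The sketch below is not the paper's algorithm: it is a brute-force illustration that assumes the discrete k-median objective (centers chosen among the input points) and tests only randomly sampled perturbations. A return value of False therefore refutes α-stability, while True is merely evidence for it. The function names `kmedian_opt` and `looks_stable` are hypothetical, introduced here for illustration.

```python
import itertools
import math
import random


def kmedian_opt(dist, k):
    """Brute-force discrete k-median on a distance matrix.

    Returns (cost, partition) for a best set of k centers, where the
    partition is a frozenset of frozensets of point indices.
    Exponential in k; intended only for tiny instances.
    """
    n = len(dist)
    best = None
    for centers in itertools.combinations(range(n), k):
        # Assign every point to its nearest chosen center.
        assign = [min(centers, key=lambda c: dist[i][c]) for i in range(n)]
        cost = sum(dist[i][assign[i]] for i in range(n))
        if best is None or cost < best[0]:
            partition = frozenset(
                frozenset(i for i in range(n) if assign[i] == c)
                for c in centers
            )
            best = (cost, partition)
    return best


def looks_stable(points, k, alpha, trials=200, seed=0):
    """Sample perturbations d <= d' <= alpha * d and compare optimal partitions.

    Assumption: random sampling cannot certify stability; it can only
    find a perturbation that changes the optimal clustering.
    """
    rng = random.Random(seed)
    n = len(points)
    base = [[math.dist(p, q) for q in points] for p in points]
    _, target = kmedian_opt(base, k)
    for _ in range(trials):
        pert = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                # Scale each pairwise distance independently by a factor in [1, alpha].
                factor = rng.uniform(1.0, alpha)
                pert[i][j] = pert[j][i] = base[i][j] * factor
        _, part = kmedian_opt(pert, k)
        if part != target:
            return False
    return True


# Two tight, far-apart groups: no factor-3 perturbation can change the optimum.
separated = [(0, 0), (0, 1), (1, 0), (100, 0), (100, 1), (101, 0)]
# Three nearly evenly spaced points: the optimum flips under small perturbations.
crowded = [(0, 0), (1, 0), (2.2, 0)]
```

On `separated`, every perturbed intra-group distance stays far below every inter-group distance, so the optimal partition cannot change for α = 3. On `crowded`, a perturbation that stretches the first gap slightly more than the second already changes the optimal clustering.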
arXiv:2009.14358v1 fatcat:7xjoqmce4ra2plk3xb6xda7kee