Clustering under Local Stability: Bridging the Gap between Worst-Case and Beyond Worst-Case Analysis [article]

Maria-Florina Balcan, Colin White
2017 arXiv pre-print
Recently, there has been substantial interest in clustering research that takes a beyond worst-case approach to the analysis of algorithms. The typical idea is to design a clustering algorithm that outputs a near-optimal solution, provided the data satisfy a natural stability notion. For example, Bilu and Linial (2010) and Awasthi et al. (2012) presented algorithms that output near-optimal solutions, assuming the optimal solution is preserved under small perturbations to the input distances. A drawback to this approach is that the algorithms are often explicitly built according to the stability assumption and give no guarantees in the worst case; indeed, several recent algorithms output arbitrarily bad solutions even when just a small section of the data does not satisfy the given stability notion. In this work, we address this concern in two ways. First, we provide algorithms that inherit the worst-case guarantees of clustering approximation algorithms, while simultaneously guaranteeing near-optimal solutions when the data is stable. Our algorithms are natural modifications to existing state-of-the-art approximation algorithms. Second, we initiate the study of local stability, which is a property of a single optimal cluster rather than an entire optimal solution. We show our algorithms output all optimal clusters which satisfy stability locally. Specifically, we achieve strong positive results in our local framework under recent stability notions including metric perturbation resilience (Angelidakis et al. 2017) and robust perturbation resilience (Balcan and Liang 2012) for the k-median, k-means, and symmetric/asymmetric k-center objectives.
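The stability notion mentioned in the abstract (the optimal solution is preserved under small perturbations to the input distances) can be illustrated with a brute-force sketch for k-median on a tiny instance. This is not the paper's algorithm. It assumes the standard reading of alpha-perturbation resilience, in which every distance may be scaled by a factor in [1, alpha], and it only samples random perturbations, so a failure disproves resilience while a pass is merely evidence for it. The function names `optimal_kmedian_partition` and `survives_random_perturbations` are hypothetical, and the perturbed distances are not forced to remain a metric.

```python
import itertools
import random


def optimal_kmedian_partition(d, k):
    """Brute-force optimal k-median partition for a distance matrix d.

    Centers are restricted to input points (an assumption of this sketch).
    Returns the partition as a frozenset of frozensets of point indices.
    Ties are broken by enumeration order, so instances with ties may be
    reported inconsistently.
    """
    n = len(d)
    best_cost, best_centers = None, None
    for centers in itertools.combinations(range(n), k):
        # k-median cost: each point pays its distance to the nearest center.
        cost = sum(min(d[p][c] for c in centers) for p in range(n))
        if best_cost is None or cost < best_cost:
            best_cost, best_centers = cost, centers
    clusters = {}
    for p in range(n):
        nearest = min(best_centers, key=lambda c: d[p][c])
        clusters.setdefault(nearest, set()).add(p)
    return frozenset(frozenset(members) for members in clusters.values())


def survives_random_perturbations(d, k, alpha, trials=100, seed=0):
    """Spot-check alpha-perturbation resilience by random sampling.

    Each trial scales every pairwise distance by an independent factor in
    [1, alpha] (kept symmetric) and compares the optimal partition with the
    unperturbed one. Returns False as soon as a perturbation changes it.
    """
    rng = random.Random(seed)
    n = len(d)
    original = optimal_kmedian_partition(d, k)
    for _ in range(trials):
        perturbed = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                scaled = d[i][j] * rng.uniform(1.0, alpha)
                perturbed[i][j] = perturbed[j][i] = scaled
        if optimal_kmedian_partition(perturbed, k) != original:
            return False
    return True


if __name__ == "__main__":
    # Two well-separated pairs of points on a line (illustrative data only).
    xs = [0.0, 1.0, 100.0, 101.0]
    d = [[abs(a - b) for b in xs] for a in xs]
    print(optimal_kmedian_partition(d, 2))
    print(survives_random_perturbations(d, 2, alpha=2.0))
```

Exhaustive enumeration is exponential in k and is used here only to make the definition concrete. The local stability notion in the abstract would instead ask whether a single optimal cluster, rather than the whole partition, is preserved.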
arXiv:1705.07157v1 fatcat:3vqzbbkx3vc2la6egthqj5dphi