Constant factor approximations for Lower and Upper bounded Clusterings [article]

Neelima Gupta, Sapna Grover, Rajni Dabas
2022 arXiv pre-print
Clustering is one of the most fundamental problems in Machine Learning. Researchers in the field often require a lower bound on the size of the clusters to maintain anonymity and an upper bound for ease of analysis. Specifying an optimal cluster size is a problem often faced by scientists. In this paper, we present a framework to obtain constant factor approximations for some prominent clustering objectives, with lower and upper bounds on cluster size. This enables scientists to give an ... ate cluster size by specifying the lower and the upper bounds for it. Our results preserve the lower bounds but may violate the upper bounds slightly, and apply when either of the bounds is uniform. We apply our framework to give the first constant factor approximations for the lower and upper bounded k-median problem (LUkM) and its generalization, the lower and upper bounded k-facility location problem (LUkFL), with a β+1 factor violation in upper bounds, where β is the violation of upper bounds in solutions of the upper bounded k-median and k-facility location problems, respectively. We also present a result on the lower and upper bounded k-center problem (LUkC) with uniform upper bounds and its generalization, the lower and (uniform) upper bounded k-supplier problem (LUkS). The approach also gives a result on the lower and upper bounded facility location problem (LUFL), improving upon the upper bound violation of 5/2 due to Gupta et al. We also reduce the violation in upper bounds for a special case when the gap between the lower and upper bounds is not too small.
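As a rough illustration of what "preserving the lower bounds while violating the upper bounds by a factor" means, the following minimal Python sketch checks the cluster sizes of a given assignment. It assumes uniform bounds and reads a violation factor v as allowing clusters of size up to v times the upper bound. The function name `check_bounds`, its parameters, and the toy assignment are hypothetical and do not come from the paper; the sketch only verifies a solution and is not the authors' approximation algorithm.

```python
from collections import Counter

def check_bounds(assignment, lower, upper, violation=1.0):
    """Check cluster sizes against uniform lower/upper bounds.

    assignment: one cluster label per point (hypothetical input format).
    violation:  assumed multiplicative slack on the upper bound, so a
                cluster may hold up to violation * upper points.
    Returns (lower_ok, upper_ok).
    """
    sizes = Counter(assignment)  # cluster label -> number of points
    lower_ok = all(size >= lower for size in sizes.values())
    upper_ok = all(size <= violation * upper for size in sizes.values())
    return lower_ok, upper_ok

# Toy example: cluster "a" has 3 points, cluster "b" has 5 points.
assignment = ["a", "a", "a", "b", "b", "b", "b", "b"]
strict = check_bounds(assignment, lower=3, upper=4)                  # no slack
relaxed = check_bounds(assignment, lower=3, upper=4, violation=2.0)  # factor-2 slack
```

Here the lower bound holds in both calls, while the upper bound of 4 is met only once the slack factor is allowed.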
arXiv:2203.14058v1 fatcat:7kilgacltbh4ldiywf2brgbtvi