Optimization of Inter-group Criteria for Clustering with Minimum Size Constraints

Published: 21 Sept 2023, Last Modified: 02 Nov 2023
Venue: NeurIPS 2023 poster
Keywords: Single Link, clustering, approximation algorithms, complexity, inter-group criterion
TL;DR: We present algorithms for two natural inter-group clustering criteria that work well in the constrained case, in which every group must have a minimum number of elements.
Abstract: Internal measures used to assess the quality of a clustering usually take into account intra-group and/or inter-group criteria. Many papers in the literature propose algorithms with provable approximation guarantees for optimizing the former. The optimization of inter-group criteria, however, is much less understood. Here, we advance the state of the art by devising algorithms with provable guarantees for the maximization of two natural inter-group criteria, namely the minimum spacing and the minimum spanning tree spacing. The former is the minimum distance between points in different groups, while the latter captures separability through the cost of the minimum spanning tree that connects all groups. We obtain results for both the unrestricted case, in which no constraint on the clusters is imposed, and the constrained case, in which each group is required to have a minimum number of points. Our constraint is motivated by the fact that the popular Single-Linkage, which optimizes both criteria in the unrestricted case, produces clusterings with many tiny groups. To complement our work, we present an empirical study with 10 real datasets that provides evidence that our methods work very well in practical settings.
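
To make the two criteria concrete, the following is a minimal Python sketch, not the algorithms of the paper. It assumes Euclidean distance and assumes that the distance between two groups is the smallest distance between a point of one and a point of the other, so the minimum spanning tree spacing is the cost of a minimum spanning tree over the groups under that distance. The function names (`group_spacing`, `min_spacing`, `mst_spacing`, `single_link`) and the toy points are hypothetical. The `single_link` helper is plain unconstrained Single-Linkage (merge the closest pair of components until k remain), included only to show how an isolated point ends up as a tiny group.

```python
import math
from itertools import combinations


def group_spacing(a, b):
    """Smallest Euclidean distance between a point of group a and a point of group b."""
    return min(math.dist(p, q) for p in a for q in b)


def min_spacing(groups):
    """Minimum spacing: smallest distance between two points in different groups."""
    return min(group_spacing(a, b) for a, b in combinations(groups, 2))


def mst_spacing(groups):
    """Cost of a minimum spanning tree whose nodes are the groups (Prim's algorithm).

    Assumption: the edge weight between two groups is group_spacing.
    """
    k = len(groups)
    in_tree = {0}
    # best[i] = cheapest known edge from the current tree to group i
    best = [group_spacing(groups[0], groups[i]) if i else 0.0 for i in range(k)]
    total = 0.0
    while len(in_tree) < k:
        j = min((i for i in range(k) if i not in in_tree), key=lambda i: best[i])
        total += best[j]
        in_tree.add(j)
        for i in range(k):
            if i not in in_tree:
                best[i] = min(best[i], group_spacing(groups[j], groups[i]))
    return total


def single_link(points, k):
    """Unconstrained Single-Linkage: merge the closest components until k remain."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        combinations(range(len(points)), 2),
        key=lambda e: math.dist(points[e[0]], points[e[1]]),
    )
    components = len(points)
    for i, j in edges:
        if components == k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            components -= 1
    clusters = {}
    for i, p in enumerate(points):
        clusters.setdefault(find(i), []).append(p)
    return list(clusters.values())


# Toy data: two pairs of nearby points and one isolated point.
points = [(0, 0), (1, 0), (4, 0), (5, 0), (11, 0)]
groups = single_link(points, 3)
print(sorted(len(g) for g in groups))  # [1, 2, 2]: the isolated point is a singleton
print(min_spacing(groups), mst_spacing(groups))  # 3.0 9.0
```

On this toy input, Single-Linkage isolates the point at (11, 0) as a group of size one, which is the kind of tiny group that a minimum size constraint rules out.
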
Submission Number: 1468