Unifying Proportional Fairness in Centroid and Non-Centroid Clustering

Benjamin Cookson; Nisarg Shah; Ziqi Yu

Unifying Proportional Fairness in Centroid and Non-Centroid Clustering

Benjamin Cookson, Nisarg Shah, Ziqi Yu

Published: 18 Sept 2025, Last Modified: 29 Oct 2025NeurIPS 2025 spotlightEveryoneRevisionsBibTeXCC BY 4.0

Keywords: Proportional Fairness, Clustering, Algorithmic Fairness

TL;DR: We design proportionally fair clustering methods when each agent's loss function is determined by both its distance from the other agents in its cluster and to a representative agent in its cluster.

Abstract: Proportional fairness criteria inspired by democratic ideals of proportional representation have received growing attention in the clustering literature. Prior work has investigated them in two separate paradigms. Chen et al. [ICML 2019] study _centroid clustering_, in which each data point's loss is determined by its distance to a representative point (centroid) chosen in its cluster. Caragiannis et al. [NeurIPS 2024] study _non-centroid clustering_, in which each data point's loss is determined by its maximum distance to any other data point in its cluster. We generalize both paradigms to introduce _semi-centroid clustering_, in which each data point's loss is a combination of its centroid and non-centroid losses, and study two proportional fairness criteria---the core and, its relaxation, fully justified representation (FJR). Our main result is a novel algorithm which achieves a constant approximation to the core, in polynomial time, even when the distance metrics used for centroid and non-centroid loss measurements are different. We also derive improved results for more restricted loss functions and the weaker FJR criterion, and establish lower bounds in each case.

Supplementary Material: zip

Primary Area: Social and economic aspects of machine learning (e.g., fairness, interpretability, human-AI interaction, privacy, safety, strategic behavior)

Submission Number: 22067

Loading