Fast and Accurate Fair $k$-Center Clustering in Doubling Metrics

Published: 23 Jan 2024, Last Modified: 23 May 2024, TheWebConf24 Oral
Keywords: clustering, fairness, k-center, coresets, MapReduce, streaming
TL;DR: We provide fair k-center clustering algorithms that scale to large datasets from low-dimensional spaces, while maintaining the best approximations achievable by state-of-the-art, non-scalable approaches.
Abstract: We study the classic $k$-center problem under the additional constraint that each cluster should be _fair_. In this setting, each point is marked with one or more _colors_, which can be used to model protected attributes (e.g., gender or ethnicity). A cluster is deemed _fair_ if, for every color, the fraction of its points marked with that color is within some prespecified range. We present a coreset-based approach to fair $k$-center for general metric spaces that attains almost the best approximation quality of the current state-of-the-art solutions, while featuring running times that can be orders of magnitude faster for large datasets of low doubling dimension. We devise sequential, streaming, and MapReduce implementations of our approach and conduct a thorough experimental analysis to provide evidence of their practicality, scalability, and effectiveness.
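The fairness condition stated in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function name `is_fair_cluster`, the representation of a point as a set of colors, and the `bounds` dictionary mapping each color to a `(low, high)` fraction range are all assumptions made for illustration. Treating an empty cluster as fair is likewise an assumption.

```python
from collections import Counter


def is_fair_cluster(cluster_colors, bounds):
    """Check the per-color fraction constraint for a single cluster.

    cluster_colors: list with one set of colors per point in the cluster
                    (a point may carry several colors).
    bounds: dict mapping color -> (low, high), the allowed range for the
            fraction of cluster points marked with that color.
    Both names and data layouts are hypothetical, chosen for this sketch.
    """
    n = len(cluster_colors)
    if n == 0:
        # Assumption: an empty cluster is considered vacuously fair.
        return True

    # Count, for every color, how many points are marked with it.
    counts = Counter(color for colors in cluster_colors for color in colors)

    # The cluster is fair only if every constrained color is within range.
    for color, (low, high) in bounds.items():
        fraction = counts[color] / n
        if not (low <= fraction <= high):
            return False
    return True


# Usage: four points, two colors, one point carrying both colors.
cluster = [{"a"}, {"b"}, {"a", "b"}, {"a"}]
print(is_fair_cluster(cluster, {"a": (0.5, 0.8), "b": (0.25, 0.5)}))
print(is_fair_cluster(cluster, {"a": (0.0, 0.5), "b": (0.25, 0.5)}))
```

In the usage example, color `a` marks 3 of 4 points and color `b` marks 2 of 4, so the first set of ranges is satisfied and the second is violated for `a`.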
Track: Graph Algorithms and Learning for the Web
Submission Guidelines Scope: Yes
Submission Guidelines Blind: Yes
Submission Guidelines Format: Yes
Submission Guidelines Limit: Yes
Submission Guidelines Authorship: Yes
Student Author: No
Submission Number: 1522