Fair Bayesian Model-Based Clustering

ICLR 2026 Conference Submission 18097 Authors

19 Sept 2025 (modified: 08 Oct 2025) | ICLR 2026 Conference Submission | Readers: Everyone | CC BY 4.0
Keywords: Algorithmic fairness, Clustering, Bayesian inference
Abstract: Fair clustering has become a socially significant task with the advancement of machine learning and the growing demand for trustworthy AI. Group fairness requires that the proportion of each sensitive group be similar across all clusters. Most existing fair clustering methods are based on $K$-means clustering and thus require both a distance between instances and the number of clusters to be specified in advance. To resolve this limitation, we propose a fair Bayesian model-based clustering method called Fair Bayesian Clustering (FBC). We develop a specially designed prior that puts its mass only on fair clusters, and implement an efficient MCMC algorithm. The main advantage of FBC is its flexibility: it can infer the number of clusters, can process data for which choosing a reasonable distance is difficult (e.g., categorical data), and can reflect constraints on cluster sizes. We illustrate these advantages by analyzing real-world datasets.
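To make the group fairness notion in the abstract concrete, the following minimal sketch computes the proportion of each sensitive group within every cluster and reports the largest deviation from the overall proportion. It is an illustration only and not the paper's fairness measure or algorithm: the function names `group_proportions` and `max_proportion_gap`, the gap-based summary, and the toy data are all assumptions introduced here.

```python
from collections import Counter


def group_proportions(labels, groups):
    """Return {cluster: {group: proportion}} for a cluster assignment.

    `labels` holds the cluster index of each instance and `groups` its
    sensitive attribute (hypothetical inputs for this sketch).
    """
    counts = {}
    for cluster, group in zip(labels, groups):
        counts.setdefault(cluster, Counter())[group] += 1
    all_groups = sorted(set(groups))
    return {
        cluster: {g: cnt[g] / sum(cnt.values()) for g in all_groups}
        for cluster, cnt in counts.items()
    }


def max_proportion_gap(labels, groups):
    """Largest absolute gap between a within-cluster group proportion
    and the corresponding overall proportion (0 means perfectly balanced)."""
    n = len(groups)
    overall = {g: k / n for g, k in Counter(groups).items()}
    props = group_proportions(labels, groups)
    return max(abs(p[g] - overall[g]) for p in props.values() for g in overall)


# Hypothetical toy data: six instances, two clusters, sensitive groups "a" and "b".
labels = [0, 0, 0, 1, 1, 1]
groups = ["a", "a", "b", "a", "b", "b"]
gap = max_proportion_gap(labels, groups)

# A perfectly balanced assignment for comparison.
fair_gap = max_proportion_gap([0, 0, 1, 1], ["a", "b", "a", "b"])
```

Under this summary, a clustering is closer to group-fair the smaller the gap is; a prior that puts mass only on fair clusters, as the abstract describes, would correspond to restricting attention to assignments whose proportions match across clusters, though the exact constraint used by FBC is not stated here.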
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 18097