Accelerating Spectral Clustering under Fairness Constraints

Published: 01 May 2025, Last Modified: 18 Jun 2025, ICML 2025 poster, CC BY 4.0
TL;DR: We improve the computation time for fair spectral clustering thanks to a new difference-of-convex formulation.
Abstract: Fairness of decision-making algorithms is an increasingly important issue. In this paper, we focus on spectral clustering with group fairness constraints, which require every demographic group to be represented in each cluster in the same proportion as in the general population. We present a new efficient method for fair spectral clustering (Fair SC) by casting the Fair SC problem within the difference of convex functions (DC) framework. To this end, we introduce a novel variable augmentation strategy and employ an alternating direction method of multipliers (ADMM) type of algorithm adapted to DC problems. We show that each associated subproblem can be solved efficiently, resulting in higher computational efficiency compared to prior work, which required a computationally expensive eigendecomposition. Numerical experiments demonstrate the effectiveness of our approach on both synthetic and real-world benchmarks, showing significant speedups in computation time over prior art, especially as the problem size grows. This work thus represents a considerable step forward towards the adoption of fair clustering in real-world applications.
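The group fairness notion described in the abstract (each demographic group appearing in every cluster in proportion to its population share) can be checked on a given clustering with a few lines of code. The sketch below is illustrative only and is not the paper's algorithm: the function names `group_proportions` and `fairness_gap` are hypothetical, and the "gap" metric (largest absolute deviation from the population share) is an assumed way of quantifying how far a clustering is from proportional representation.

```python
import numpy as np


def group_proportions(labels, groups):
    """Return (group_ids, shares), where shares[c, g] is the fraction of
    cluster c's members that belong to group g. Illustrative helper only."""
    labels = np.asarray(labels)
    groups = np.asarray(groups)
    cluster_ids, c_idx = np.unique(labels, return_inverse=True)
    group_ids, g_idx = np.unique(groups, return_inverse=True)
    # Count how many points fall into each (cluster, group) cell.
    counts = np.zeros((len(cluster_ids), len(group_ids)))
    np.add.at(counts, (c_idx, g_idx), 1)
    # Normalize each row; every cluster returned by np.unique is non-empty.
    return group_ids, counts / counts.sum(axis=1, keepdims=True)


def fairness_gap(labels, groups):
    """Largest absolute difference between a group's share inside any cluster
    and its share in the whole population (0.0 means exactly proportional)."""
    groups = np.asarray(groups)
    group_ids, shares = group_proportions(labels, groups)
    population = np.array([(groups == g).mean() for g in group_ids])
    return float(np.abs(shares - population).max())


if __name__ == "__main__":
    groups = ["a", "b", "a", "b"]
    # Each cluster holds one member of each group: proportional.
    print(fairness_gap([0, 1, 0, 1], ["a", "a", "b", "b"]))  # 0.0
    # Each cluster holds a single group: maximally unbalanced for two groups.
    print(fairness_gap([0, 0, 1, 1], ["a", "a", "b", "b"]))  # 0.5
```

A gap of zero corresponds to the exact proportional-representation constraint; any practical tolerance on this quantity would be a design choice not specified in the text above.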
Lay Summary: Clustering is a technique that helps computers automatically group similar data points: for example, organizing patients based on medical profiles or categorizing users by purchasing preferences. However, in applications involving sensitive attributes such as healthcare or hiring, it's important that these groupings treat different demographic groups fairly. Our work focuses on spectral clustering, a widely used method that analyzes relationships between data points using network structures. Existing approaches to enforcing fairness in spectral clustering rely on computationally intensive operations, which makes them unsuitable for large-scale datasets. We introduce a new, efficient algorithm that achieves fair clustering without the need for expensive computations. By reformulating the problem using a mathematical framework known as "difference of convex functions," and by designing a tailored optimization method, we significantly reduce the computational cost. This advancement enables fair clustering to be applied more broadly in real-world scenarios, such as healthcare, education, and public policy, where fairness and efficiency are both critical.
Primary Area: Social Aspects->Fairness
Keywords: fairness, spectral clustering, difference of convex, scalability
Submission Number: 12277