Deep fair clustering with multi-level decorrelation

Published: 01 Jan 2024, Last Modified: 05 Mar 2025, Venue: Inf. Sci. 2024, License: CC BY-SA 4.0
Abstract: Fair clustering aims to prevent sensitive attributes (e.g., race or gender) from dominating the clustering process. However, real-world datasets are often low in quality and high in dimensionality, which prevents existing fair clustering methods from achieving satisfactory outcomes. In high-dimensional continuous data, sensitive attributes are typically intertwined with other attributes, forming backgrounds or entities within the data. This entanglement induces significant correlations among features and samples across different clusters, hindering the model's ability to explore the intrinsic structure. To address these issues, we propose a novel fair clustering method that incorporates multi-level decorrelation constraints. Our goal is to extract inherently fair structural information despite the interference of sensitive attributes, enhancing both the validity and fairness of the model. Specifically, we introduce a new cluster-wise similarity metric based on the partition correlation coefficient, which facilitates cluster-level decorrelation and captures cluster-discriminative properties. Furthermore, by incorporating softmax-formulated decorrelation at the sample level and the feature level, we concurrently explore representations that favor fairness. These three components are integrated into a single clustering framework, yielding a more robust and confident data partition. Experiments conducted on six commonly used datasets demonstrate the effectiveness of the proposed method.
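The abstract does not give the loss formulas, so the following is only a minimal sketch of what a softmax-formulated decorrelation term could look like, not the authors' formulation. It assumes a cross-entropy objective that pushes the cosine-similarity matrix between columns toward the identity. The function names, the `temperature` parameter, and the toy matrices are all hypothetical, and the partition correlation coefficient used for the cluster level is not reproduced here.

```python
import math


def transpose(matrix):
    """Swap rows and columns so the same loss can act on samples instead of features."""
    return [list(col) for col in zip(*matrix)]


def column_normalize(matrix):
    """Scale each column of a row-major matrix to unit L2 norm."""
    n_cols = len(matrix[0])
    norms = [
        math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0  # guard all-zero columns
        for j in range(n_cols)
    ]
    return [[row[j] / norms[j] for j in range(n_cols)] for row in matrix]


def softmax_decorrelation_loss(matrix, temperature=0.5):
    """Assumed loss: softmax cross-entropy that favours each column being
    similar to itself and dissimilar to every other column."""
    unit = column_normalize(matrix)
    n_cols = len(unit[0])
    loss = 0.0
    for i in range(n_cols):
        # Cosine similarity between column i and every column j, temperature-scaled.
        logits = [
            sum(row[i] * row[j] for row in unit) / temperature
            for j in range(n_cols)
        ]
        # Numerically stable log-sum-exp.
        peak = max(logits)
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
        # Negative log-probability assigned to the diagonal entry.
        loss += log_norm - logits[i]
    return loss / n_cols


# Toy representations: rows are samples, columns are features (hypothetical data).
decorrelated = [[1.0, 0.0], [0.0, 1.0]]
correlated = [[1.0, 1.0], [1.0, 1.0]]

feature_loss_decorrelated = softmax_decorrelation_loss(decorrelated)
feature_loss_correlated = softmax_decorrelation_loss(correlated)

# Sample-level variant: apply the same loss to the transposed matrix.
sample_loss_correlated = softmax_decorrelation_loss(transpose(correlated))
```

Under these assumptions, the loss is smaller for the matrix with orthogonal columns than for the one with identical columns, which is the behaviour a decorrelation term is meant to reward. A cluster-level variant would apply a similar idea to the columns of a soft assignment matrix, with the similarity replaced by whatever metric the paper defines.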