Do We Really Achieve Fairness with Explicit Sensitive Attributes?

TMLR Paper973 Authors

20 Mar 2023 (modified: 11 May 2023) · Withdrawn by Authors
Abstract: Recent research on fairness has shown that merely removing sensitive attributes from model inputs is not enough to achieve demographic parity, as non-sensitive attributes can still reveal sensitive information to varying extents. For instance, a person's ``race'' can be deduced to some extent from their ``zipcode''. While current methods directly use explicit sensitive attributes (e.g., ``race'') to debias model predictions (e.g., those obtained from ``zipcode''), they often fail to uphold demographic parity. This is especially true for highly sensitive samples, whose non-sensitive attributes are more likely to leak sensitive information than those of less sensitive samples. This challenge arises because the model treats each sample as carrying a specific sensitive attribute, while the prediction actually incorporates only partial sensitive information, leading to potential biases. This observation highlights the need for demographic parity measurements that account for the degree of sensitive information leakage in individual samples, and that differentiate between samples with varying degrees of leakage. To address this issue, we introduce a new definition of group fairness measurement called $\alpha$-Demographic Parity, which ensures demographic parity for samples with differing degrees of sensitive information leakage. To achieve $\alpha$-Demographic Parity, we propose to directly promote the independence of model predictions from the distribution of sensitive information, rather than from the specific sensitive attributes. This approach directly minimizes the Hilbert-Schmidt Independence Criterion (HSIC) between the two distributions, thereby ensuring more precise and fair predictions across all subgroups. Extensive experiments show that our proposed method outperforms existing approaches in achieving $\alpha$-Demographic Parity and performs strongly in scenarios with limited sensitive attribute information.
Our code is anonymously available at https://anonymous.4open.science/r/TMLR_STFS_code-2ED6
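As a concrete illustration of the independence measure the abstract names, the following is a minimal sketch of the standard biased empirical HSIC estimator with RBF kernels. This is not the paper's exact implementation; the kernel choice, the bandwidth `sigma`, and the function names are assumptions for illustration only.

```python
import numpy as np

def rbf_kernel(x, sigma=1.0):
    # Pairwise RBF (Gaussian) kernel matrix for rows of x.
    # sigma is an assumed bandwidth, not taken from the paper.
    sq = np.sum(x ** 2, axis=1, keepdims=True)
    d2 = sq + sq.T - 2.0 * (x @ x.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    # Biased empirical HSIC estimate: trace(K H L H) / (n - 1)^2,
    # where H centers the kernel matrices. Larger values indicate
    # stronger statistical dependence between x and y.
    n = x.shape[0]
    K = rbf_kernel(x, sigma)
    L = rbf_kernel(y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```

In a fairness setting such as the one described above, `x` would hold model predictions and `y` a representation of sensitive information; driving `hsic(x, y)` toward zero during training promotes independence between the two.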
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Mingming_Gong1
Submission Number: 973