A Survey on Fairness Without Demographics

TMLR Paper 2070 Authors

19 Jan 2024 (modified: 21 Apr 2024) · Under review for TMLR
Abstract: The issue of bias in Machine Learning (ML) models is a significant challenge for the machine learning community. Real-world biases can be embedded in the data used to train models, and prior studies have shown that ML models can learn and even amplify these biases. This can result in unfair treatment of individuals based on their inherent characteristics or sensitive attributes such as gender, race, or age. With the increasing use of ML models in high-stakes scenarios, ensuring fairness is crucial and has gained significant attention from researchers in recent years. However, ensuring fairness becomes much more difficult when the assumption of full access to sensitive attributes does not hold. Settings where this assumption fails include cases where (1) only limited or noisy demographic information is available, or (2) demographic information is entirely unobserved due to privacy restrictions. In this survey, we review recent research efforts aimed at ensuring fairness when sensitive attributes are missing. We propose a taxonomy of existing works and, more importantly, highlight current challenges and future research directions to stimulate research on ML fairness in the setting of missing sensitive attributes.
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Bo_Li19
Submission Number: 2070