Keywords: fairness model evaluation, fair deep learning, adversarial fairness
TL;DR: Benchmarking Bias Mitigation Algorithms in Representation Learning through Fairness Metrics
Abstract: With the recent expanding attention of machine learning researchers and practitioners to fairness, there is a void of a common framework to analyze and compare the capabilities of proposed models in deep representation learning. In this paper, we evaluate different fairness methods trained with deep neural networks on a common synthetic dataset and a real-world dataset to obtain better insights on how these methods work. In particular, we train about 3000 different models in various setups, including imbalanced and correlated data configurations, to verify the limits of the current models and better understand in which setups they are subject to failure. Our results show that the bias of models increase as datasets become more imbalanced or datasets attributes become more correlated, the level of dominance of correlated sensitive dataset features impact bias, and the sensitive information remains in the latent representation even when bias-mitigation algorithms are applied. Overall, we present a dataset, propose various challenging evaluation setups, and rigorously evaluate recent promising bias-mitigation algorithms in a common framework and publicly release this benchmark, hoping the research community would take it as a common entry point for fair deep learning.
Supplementary Material: zip