Keywords: machine fairness, intersectional fairness, bias mitigation, fair learning, knowledge distillation
Abstract: Bias mitigation algorithms aim to reduce the performance disparity between different protected groups. Existing techniques focus on settings with a small number of protected groups arising from a single protected attribute, such as skin color, gender, or age. In real-world applications, however, there are multiple protected attributes, yielding a large number of intersectional protected groups. These intersectional groups are particularly prone to severe underrepresentation in datasets. We conduct the first thorough empirical analysis of how existing bias mitigation methods scale to this setting, using large-scale datasets including the ImageNet People Subtree and CelebA. We find that as more protected attributes are introduced to a task, it becomes more important to leverage the protected attribute labels during training to promote fairness. We also find that knowledge distillation, used in conjunction with group-specific models, can help scale existing fair learning methods to hundreds of intersectional protected groups and reduce bias. We show on ImageNet's People Subtree that combining these insights can further reduce the bias amplification of fair learning algorithms by 15%, a surprising reduction given that the dataset has 196 protected groups but fewer than 10% of the training examples have protected attribute labels.
One-sentence Summary: We scale bias mitigation and fair learning techniques to hundreds of intersectional protected groups using knowledge distillation and group-specific predictors.
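To make the distillation-with-group-specific-models idea concrete, the sketch below shows one common way such a setup can be wired together: group-specific teacher models provide soft targets that a single shared student is trained to match, alongside the usual hard-label loss. This is a minimal illustrative sketch assuming PyTorch, not the authors' implementation; the function names, the routing of each example to its group's teacher, and the loss weighting are all hypothetical choices made for illustration.

```python
# Illustrative sketch only (not the paper's implementation): distill
# group-specific teachers into one student so predictions for
# underrepresented intersectional groups benefit from group-aware teachers.
# All names here are hypothetical.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Standard KD loss: KL to the teacher's soft targets plus hard-label CE."""
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard


def train_step(student, teachers, optimizer, images, labels, group_ids):
    """One update: each example is routed to the teacher for its
    intersectional group (teachers is a dict keyed by group id)."""
    optimizer.zero_grad()
    student_logits = student(images)
    with torch.no_grad():
        # Collect per-example logits from that example's group-specific teacher.
        teacher_logits = torch.stack([
            teachers[g.item()](x.unsqueeze(0)).squeeze(0)
            for x, g in zip(images, group_ids)
        ])
    loss = distillation_loss(student_logits, teacher_logits, labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

The per-example routing loop is written for clarity rather than speed; a batched version would group examples by `group_ids` before calling each teacher. How the abstract's method handles examples without protected attribute labels (the majority of the training set) is not specified here.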