TL;DR: Our work highlights the importance of careful subgroup definition in bias mitigation and suggests it as an alternative lever for improving the robustness and fairness of machine learning models.
Abstract: Despite the constant development of new bias mitigation methods for machine learning, no method consistently succeeds, and a fundamental question remains unanswered: when and why do bias mitigation techniques fail? In this paper, we hypothesise that a key factor may be the often-overlooked but crucial step shared by many bias mitigation methods: the definition of subgroups. To investigate this, we conduct a comprehensive evaluation of state-of-the-art bias mitigation methods across multiple vision and language classification tasks, systematically varying subgroup definitions, including coarse, fine-grained, intersectional, and noisy subgroups. Our findings reveal that subgroup choice significantly impacts performance, with certain groupings paradoxically leading to worse outcomes than no mitigation at all. These findings suggest that observing a disparity across a set of subgroups is not a sufficient reason to use those subgroups for mitigation. Through theoretical analysis, we explain these phenomena and uncover a counter-intuitive insight that, in some cases, improving fairness with respect to a particular set of subgroups is best achieved by using a different set of subgroups for mitigation. Our work highlights the importance of careful subgroup definition in bias mitigation and presents it as an alternative lever for improving the robustness and fairness of machine learning models.
Lay Summary: There are increasing reports of bias in the performance of machine learning models. This can manifest, for example, as unequal performance across key population subgroups, such as between men and women. Methods known as ``bias mitigation methods'' have been developed to try to prevent machine learning models from learning these biases. For instance, a very simple method involves rebalancing a model's training data so that the model learns from an equal number of examples from the population subgroups of interest (e.g. equal representation of men and women). However, recently, many works have highlighted that these bias mitigation methods often fail in practice.
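To make the rebalancing example above concrete, here is a minimal Python sketch of subgroup upsampling; the toy data, the ``sex'' column name, and the use of pandas are illustrative assumptions rather than details from the paper.

```python
# Minimal sketch of bias mitigation by subgroup rebalancing (illustrative only;
# the toy data and the "sex" column are assumptions, not taken from the paper).
import pandas as pd

# Toy training set in which one subgroup is under-represented.
train = pd.DataFrame({
    "feature": range(10),
    "label":   [0, 1] * 5,
    "sex":     ["male"] * 8 + ["female"] * 2,
})

# Upsample every subgroup to the size of the largest one, so the model
# sees an equal number of examples from each subgroup during training.
target_size = train["sex"].value_counts().max()
balanced = (
    train.groupby("sex", group_keys=False)
         .apply(lambda g: g.sample(n=target_size, replace=True, random_state=0))
         .reset_index(drop=True)
)

print(balanced["sex"].value_counts())  # both subgroups now equally represented
```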
In this paper, we seek to understand why these methods are failing so often. We consider a crucial but often overlooked step of the bias mitigation process: subgroup definition. This step is required by almost all methods, yet very little work has examined whether it could actually be optimised, and most of the time the same coarse subgroups are used (e.g. male/female or white/non-white).
To understand this, we conduct extensive experiments on four different datasets, applying a range of established bias mitigation methods to different possible subgroup combinations. We find that performance is highly dependent on the subgroups used, and we gather key insights on how to best define subgroups for optimal mitigation. Overall, our work highlights the importance of careful subgroup definition in bias mitigation and suggests it as an alternative lever for improving the robustness and fairness of machine learning models.
Link To Code: https://github.com/anissa218/subgroups_bias_mit
Primary Area: Social Aspects->Fairness
Keywords: bias mitigation, robustness, spurious correlations, generalisation, fairness
Submission Number: 15799