Boosting Generalizable Fairness With Mahalanobis Distances Guided Boltzmann Exploratory Testing

Published: 01 Jan 2025 · Last Modified: 15 Sept 2025 · IEEE Trans. Software Eng. 2025 · CC BY-SA 4.0
Abstract: Although machine learning models have been remarkably effective for decision-making tasks such as employment, insurance, and criminal justice, it remains urgent yet challenging to ensure that model predictions are reliable and socially fair. This amounts to detecting and repairing potential discriminatory defects of machine learning models extensively with authentic testing data. In this paper, we propose a novel Mahalanobis distance guided Adaptive Exploratory Fairness Testing (MAEFT) approach, which searches for individual discriminatory instances (IDIs) through deep reinforcement learning with an adaptive extension of Boltzmann exploration, significantly reducing overestimation. MAEFT uses Mahalanobis distances to guide the search with realistic correlations between input features. Thus, by learning a more accurate state-action value approximation, MAEFT can cover a much wider valid input space, sharply reducing the number of duplicate instances visited, and identify more unique tests and IDIs calibrated to the realistic feature correlations. Compared with state-of-the-art black-box and white-box fairness testing methods, our approach generates on average 4.65%-161.66% more unique tests and identifies 154.60%-634.80% more IDIs, with a performance speed-up of 12.54%-1313.47%. Moreover, the IDIs identified by MAEFT can be effectively exploited to repair the original models through retraining. These IDIs lead to, on average, a 59.15% boost in model fairness, 15.94%-48.73% higher than that achieved with IDIs identified by the state-of-the-art fairness testing methods. The models retrained with MAEFT also exhibit 37.66%-46.81% stronger generalization ability than those retrained with the state-of-the-art fairness testing methods.
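The abstract names two standard ingredients: Mahalanobis distance to score how well a candidate input respects the feature correlations of the training distribution, and Boltzmann (softmax) exploration to sample actions in proportion to their estimated values. The following is a minimal illustrative sketch of those two primitives only; the function names, shapes, and the plain (non-adaptive) Boltzmann rule are assumptions for exposition, not MAEFT's actual implementation.

```python
import numpy as np

def mahalanobis_distance(x, mean, cov):
    """Distance of input x from a distribution with the given mean and
    covariance; small values indicate x is consistent with the realistic
    feature correlations encoded in cov."""
    diff = x - mean
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

def boltzmann_action(q_values, temperature=1.0, rng=None):
    """Sample an action index with probability proportional to
    exp(Q / temperature); lower temperature means greedier choices."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(q_values, dtype=float) / temperature
    logits -= logits.max()            # subtract max for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))
```

In a fairness-testing loop of this shape, the Mahalanobis score would penalize perturbed inputs that drift outside the realistic data manifold, while the temperature schedule trades off exploring new input regions against exploiting known discriminatory ones.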