Are There Exceptions to Goodhart's law? On the Moral Justification of Fairness-Aware Machine Learning

Hilde Weerts, Lambèr Royakkers, Mykola Pechenizkiy

Published: 22 Oct 2025, Last Modified: 15 Dec 2025ACM Journal on Responsible ComputingEveryoneRevisionsCC BY-SA 4.0

Abstract: Fairness-aware machine learning (fair-ml) techniques are algorithmic interventions designed to ensure that individuals who are affected by the predictions of a machine learning model are treated fairly. The problem is often posed as an optimization problem, where the objective is to achieve high predictive performance under a quantitative fairness constraint. However, any attempt to design a fair-ml algorithm must assume a world where Goodhart’s law has an exception: when a fairness measure becomes an optimization constraint, it does not cease to be a good measure. In this paper, we argue that fairness measures are particularly sensitive to Goodhart’s law. Our main contributions are as follows. First, we present a framework for moral reasoning about the justification of fairness metrics. In contrast to existing work, our framework incorporates the belief that whether a distribution of outcomes is fair, depends not only on the cause of inequalities but also on what moral claims decision subjects have to receive a particular benefit or avoid a burden. We use the framework to distil moral and empirical assumptions under which particular fairness metrics correspond to a fair distribution of outcomes. Second, we explore the extent to which employing fairness metrics as a constraint in a fair-ml algorithm is morally justifiable, exemplified by the fair-ml algorithm introduced by Hardt et al. [21]. We illustrate that enforcing a fairness metric through a fair-ml algorithm often does not result in the fair distribution of outcomes that motivated its use and can even harm the individuals the intervention was intended to protect.

External IDs:doi:10.1145/3771734