Abstract: Achieving fairness in AI systems is critical yet challenging because fairness metrics conflict with one another and rest on different societal assumptions, for example, the extent to which racist and sexist societal processes are presumed to cause harm and the extent to which affirmative corrections should be applied. Satisfying these measures can also reduce the accuracy of the AI system. This work takes a step towards a unifying, human-centered fairness framework that guides stakeholders in navigating these complexities, including the potential incompatibility of fairness criteria and the corresponding trade-offs. Our framework acknowledges the spectrum of fairness definitions, from individual to group fairness and from infra-marginal (politically conservative) to intersectional (politically progressive) treatment of disparities. Stakeholders prioritize desired outcomes by assigning weights to these fairness considerations, trading them off against each other and against predictive performance, and they can explore the impact of their choices in order to reach a consensus solution. Our learning algorithms then ensure that the resulting AI system reflects the stakeholder-chosen priorities. By enabling multi-stakeholder compromises, our framework can potentially mitigate individual analysts' subjectivity. We validate our methods experimentally on the UCI Adult census dataset and the COMPAS criminal recidivism dataset.
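To illustrate the kind of weighted trade-off the abstract describes, below is a minimal sketch (not the paper's actual algorithm or API) of a scalarized objective in which stakeholder-assigned weights balance predictive error against a group-fairness penalty and an individual-fairness penalty. All function names, penalty definitions, and weights (w_acc, w_group, w_indiv) are hypothetical choices made for illustration.

```python
# Illustrative sketch: a scalarized objective combining accuracy with group and
# individual fairness penalties under stakeholder-chosen weights. All names and
# penalty definitions are hypothetical, not taken from the paper.
import numpy as np

def group_disparity(y_pred, groups):
    """Gap in positive prediction rates across groups (a demographic-parity-style
    group-fairness penalty)."""
    rates = [y_pred[groups == g].mean() for g in np.unique(groups)]
    return max(rates) - min(rates)

def individual_inconsistency(y_pred, X, n_pairs=1000, seed=0):
    """Average prediction difference between randomly sampled similar pairs
    (a rough individual-fairness penalty: similar individuals, similar outcomes)."""
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(X), n_pairs)
    j = rng.integers(0, len(X), n_pairs)
    similarity = np.exp(-np.linalg.norm(X[i] - X[j], axis=1))
    return float(np.mean(similarity * np.abs(y_pred[i] - y_pred[j])))

def stakeholder_objective(y_true, y_pred, X, groups, w_acc, w_group, w_indiv):
    """Weighted objective (lower is better); the weights encode the
    stakeholder-chosen priorities among accuracy and fairness notions."""
    error = np.mean((y_true - y_pred) ** 2)  # predictive-performance term
    return (w_acc * error
            + w_group * group_disparity(y_pred, groups)
            + w_indiv * individual_inconsistency(y_pred, X))

# Toy usage: compare two candidate predictors under one stakeholder weighting.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 5))
    groups = rng.integers(0, 2, size=500)
    y_true = (X[:, 0] + 0.5 * groups + rng.normal(scale=0.5, size=500) > 0).astype(float)
    candidate_a = (X[:, 0] > 0).astype(float)                    # ignores group membership
    candidate_b = ((X[:, 0] + 0.5 * groups) > 0).astype(float)   # uses group membership
    for name, y_pred in [("A", candidate_a), ("B", candidate_b)]:
        score = stakeholder_objective(y_true, y_pred, X, groups,
                                      w_acc=1.0, w_group=0.5, w_indiv=0.5)
        print(name, round(score, 4))
```

Changing the weights shifts which candidate scores better, which is the sense in which stakeholders can explore and trade off their fairness choices before committing to a consensus solution.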