Abstract: As machine learning (ML) algorithms are increasingly used in applications that involve humans, concerns have arisen that these algorithms may be biased against certain social groups. $\textit{Counterfactual fairness}$ (CF) is a fairness notion proposed by Kusner et al. (2017) that measures the unfairness of ML predictions; it requires that the prediction an individual receives in the real world have the same marginal distribution as the prediction the individual would receive in a counterfactual world in which the individual belongs to a different group. Although CF ensures fair ML predictions, it does not consider the downstream effects of those predictions on individuals. Because humans are strategic and often adapt their behavior in response to the ML system, predictions that satisfy CF may not lead to fair future outcomes for the individuals. In this paper, we introduce $\textit{lookahead counterfactual fairness}$ (LCF), a fairness notion that accounts for the downstream effects of ML models and requires the individual's $\textit{future status}$ to be counterfactually fair. We theoretically identify conditions under which LCF can be satisfied and propose an algorithm based on these theorems. We also extend the concept to path-dependent fairness. Experiments on both synthetic and real data validate the proposed method.
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Mingming_Gong1
Submission Number: 3312