Keywords: Causal Fairness, Distributional Testing, Potential Outcomes Framework
Abstract: Causality is widely used in fairness analysis to prevent discrimination on sensitive attributes, such as gender in hiring and race in crime prediction. However, the conventional potential outcomes framework emphasizes group-level causal estimation but lacks a population-level (distributional) perspective in fairness analysis, which may lead to a fairness illusion: misrepresenting fairness by approximating the distributional information with limited statistics (e.g., first- or second-order statistics such as the mean or variance). To address this limitation, we define a distribution-based potential outcomes framework in a reproducing kernel Hilbert space, based on which we reframe fairness analysis as a problem of Distributional Causal Fairness Testing (DCFT). Within DCFT, the null hypothesis states that a sensitive attribute is fair if its factual and counterfactual potential outcome distributions are sufficiently close. This discrepancy is quantified by the proposed Distributional Counterfactual Treatment Effect, which serves as the test statistic in DCFT. To ensure the reliability of the testing results, we establish the testing consistency of the distributional counterfactual treatment effect through rigorous theoretical analysis. Furthermore, DCFT offers fine-grained control over the fairness criterion through a tunable fairness confidence $\epsilon$, enabling flexible fairness sensitivity. Extensive experiments on real-world datasets demonstrate that DCFT reliably diagnoses unfairness in deep models, validating its practical effectiveness and theoretical soundness.
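The abstract describes a test that compares factual and counterfactual outcome distributions in an RKHS and rejects fairness when their discrepancy exceeds a tolerance $\epsilon$. The paper's exact Distributional Counterfactual Treatment Effect statistic is not given here, so the following is only a minimal sketch of the general idea, assuming a standard (biased) maximum mean discrepancy estimate with an RBF kernel as the distributional distance; the function names, the kernel bandwidth `sigma`, and the decision rule `mmd2 <= epsilon` are illustrative assumptions, not the authors' actual procedure.

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    # Gaussian (RBF) kernel matrix between two sample matrices (n, d) and (m, d).
    sq = np.sum(x**2, 1)[:, None] + np.sum(y**2, 1)[None, :] - 2.0 * x @ y.T
    return np.exp(-sq / (2.0 * sigma**2))

def mmd2(x, y, sigma=1.0):
    # Biased estimate of the squared maximum mean discrepancy between the
    # empirical distributions of x and y, i.e. their RKHS mean-embedding distance.
    kxx = rbf_kernel(x, x, sigma)
    kyy = rbf_kernel(y, y, sigma)
    kxy = rbf_kernel(x, y, sigma)
    return kxx.mean() + kyy.mean() - 2.0 * kxy.mean()

def distributional_fairness_test(factual, counterfactual, epsilon=0.05, sigma=1.0):
    # Retain the "fair" null hypothesis when the distributional discrepancy
    # between factual and counterfactual outcome samples stays within epsilon.
    return mmd2(factual, counterfactual, sigma) <= epsilon

rng = np.random.default_rng(0)
# Identical outcome distributions: discrepancy is near zero, test retains fairness.
fair = distributional_fairness_test(rng.normal(0, 1, (500, 1)),
                                    rng.normal(0, 1, (500, 1)))
# Counterfactual outcomes shifted by 2: discrepancy is large, test flags unfairness.
unfair = distributional_fairness_test(rng.normal(0, 1, (500, 1)),
                                      rng.normal(2, 1, (500, 1)))
print(fair, unfair)
```

Raising `epsilon` makes the test more tolerant of small distributional gaps, which mirrors the tunable fairness sensitivity the abstract attributes to the fairness confidence $\epsilon$.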
Supplementary Material: zip
Primary Area: causal reasoning
Submission Number: 6019