Intervention-Driven Correlation Reduction: A Data Generation Approach for Achieving Counterfactually Fair Predictors

Published: 2025 (ICDE 2025), Last Modified: 28 Jan 2026, License: CC BY-SA 4.0
Abstract: Achieving counterfactual fairness is a critical objective in fairness research within machine learning. Studies have shown that machine learning models often inherit biases from their training data, leading to unfair decision-making. Fair data generation methods aim to mitigate these biases so that predictors trained on the generated data uphold fairness. However, in the context of counterfactual fairness, existing methods for generating fair data are often limited in their applicability and lead to significant performance losses in downstream predictors. To address these issues, this paper proposes a new algorithm for generating counterfactually fair data, allowing predictors trained on this generated data to adhere to counterfactual fairness. We propose a new metric, Intervention-Driven Correlation (IDC), to evaluate the counterfactual fairness of generative models. IDC assesses fairness by applying random interventions to samples and measuring the statistical correlation between the degree of intervention and the outcome of interest. This metric is applicable to both discrete and continuous sensitive attributes and labels. Furthermore, our studies reveal a critical insight: counterfactually fair data does not always guarantee counterfactually fair predictors when deployed in real-world scenarios. We identify the root causes of this issue and, to bridge this gap, propose the IDC-Reduction method, which ensures the fairness of downstream predictors by generating counterfactually fair data. Experimentally, our method outperforms existing approaches and achieves counterfactual fairness regardless of the type of downstream predictor.
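The abstract describes IDC only in words: apply random interventions to samples, then measure the statistical correlation between the degree of intervention and the outcome. The Python sketch below is one possible reading of that description, not the paper's definition. The toy generator, the additive Gaussian intervention on a continuous sensitive attribute, the choice of Pearson correlation, and every name (`toy_generator`, `idc_sketch`, `leak`) are assumptions made for illustration.

```python
import numpy as np


def toy_generator(a, u, leak):
    """Hypothetical generator (assumption): the outcome is driven by latent
    noise u and, through `leak`, by the sensitive attribute a."""
    return u + leak * a


def idc_sketch(generator, a, u, rng, scale=1.0):
    """Illustrative IDC-style score (assumption, not the paper's formula):
    absolute Pearson correlation between the size of a random intervention
    on the sensitive attribute and the outcome generated under it."""
    # Random interventions: shift each sample's sensitive attribute.
    delta = rng.normal(0.0, scale, size=a.shape)
    # Regenerate outcomes under the intervened attribute, keeping u fixed.
    y_intervened = generator(a + delta, u)
    return float(abs(np.corrcoef(delta, y_intervened)[0, 1]))


rng = np.random.default_rng(0)
n = 5000
a = rng.normal(size=n)  # continuous sensitive attribute
u = rng.normal(size=n)  # latent noise shared across interventions

# A generator that ignores the sensitive attribute vs. one that leaks it.
fair_score = idc_sketch(lambda a_, u_: toy_generator(a_, u_, leak=0.0), a, u, rng)
biased_score = idc_sketch(lambda a_, u_: toy_generator(a_, u_, leak=0.8), a, u, rng)

print(f"fair generator score:   {fair_score:.3f}")
print(f"biased generator score: {biased_score:.3f}")
```

In this toy setting, a generator whose outcome ignores the sensitive attribute scores near zero, while one that leaks the attribute scores clearly higher. This matches the intuition in the abstract that a lower correlation indicates a fairer generator. A discrete sensitive attribute would need a different intervention scheme and correlation measure, which this sketch does not attempt.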