Keywords: Counterfactual, Counterfactual Fairness, Foundation Models
Abstract: The widespread application of foundation models across various domains raises significant concerns regarding fairness and bias.
In this work, we focus on a specific notion of fairness, Counterfactual Fairness (CF), which posits that an individual's outcome should remain the same had they belonged to a different sensitive group.
CF is grounded in an underlying causal model, and it typically necessitates either access to the true causal model or the availability of counterfactual pairs.
While previous studies have made some progress when such information is available, acquiring it is often challenging in real-world applications.
In this paper, we aim to achieve CF in a more practical setting where only limited causal knowledge is available.
Through extensive empirical studies, we demonstrate that naive adaptations of existing methods are inadequate in such contexts.
To bridge the gap, we first introduce a more carefully designed approach for generating counterfactuals in practice, compatible with existing methodologies.
Subsequently, we present a technique for utilizing estimated counterfactuals and potentially biased pretrained models.
The feasibility of our approaches is validated through both theory and empirical investigation.
Submission Number: 21