Keywords: causal inference, confounded-shift, causal generalisation
Abstract: Adapting to latent-confounded shift remains a core challenge in modern AI. Such shift is driven by hidden variables that induce spurious, non-transportable correlations between inputs and outputs. A practical failure mode arises when fine-tuning pre-trained foundation models on confounded data (e.g., where certain text tokens or image backgrounds spuriously correlate with the label), leaving models vulnerable at deployment. We introduce *causal fine-tuning*, which frames model adaptation as an identification problem and poses an explicit causal model that decomposes inputs into low-level spurious features and high-level causal representations. Under this family of models, we formalize the assumptions required for identification. Using pre-trained language models as a case study, we show how identifying and adjusting these components during causal fine-tuning enables automatic adaptation to such shift at test time. Experiments on real-world stress-test benchmarks demonstrate that our method outperforms black-box domain generalization baselines, highlighting the benefits of explicitly modeling causal structure.
Supplementary Material: zip
Primary Area: causal reasoning
Submission Number: 12859