Keywords: causal inference, confounded-shift, causal generalisation
Abstract: Adapting to latent-confounded shift remains a core challenge in modern AI. Such shift is driven by hidden variables that induce spurious, non-transportable correlations between inputs and outputs. A practical failure mode arises when fine-tuning pre-trained foundation models on confounded data (e.g., where certain text tokens or image backgrounds spuriously correlate with the label), leaving models vulnerable at deployment. We introduce *causal fine-tuning*, which frames model adaptation as an identification problem and poses an explicit causal model that decomposes inputs into low-level spurious features and high-level causal representations. Under this family of models, we formalize the assumptions required for identification. Using pre-trained language models as a case study, we show how identifying and adjusting these components during causal fine-tuning enables automatic adaptation to such shift at test time. Experiments on real-world stress-test benchmarks demonstrate that our method outperforms black-box domain generalization baselines, highlighting the benefits of explicitly modeling causal structure.
Supplementary Material: zip
Primary Area: causal reasoning
Submission Number: 12859