Transformers with Endogenous In-Context Learning: Bias Characterization and Mitigation

Published: 26 Jan 2026, Last Modified: 11 Apr 2026ICLR 2026 PosterEveryoneRevisionsBibTeXCC BY 4.0
Keywords: In-Context Learning, Hidden Confounder, Debiasing
TL;DR: This paper presents a pioneer theoretical analysis on the endogenous bias occurred in ICL prediction, with a higly-efficient debiasing method.
Abstract: In-context learning (ICL) enables pre-trained transformers (TFs) to perform few-shot learning across diverse tasks, fostering growing research into its underlying mechanisms. However, existing studies typically assume a causally-sufficient regime, overlooking spurious correlations and prediction bias introduced by hidden confounders (HCs). As HC commonly exists in real-world cases, current ICL understandings may not align with actual data structures. To fill this gap, we contribute the pioneer theoretical analysis towards a novel problem setup termed as ICL-HC, which offers understanding the effect of HC on the pre-training of TFs and the following ICL prediction. Our theoretical results entail that pre-trained TFs exhibits certain prediction bias with proportional to the confounding strength. To migrate such prediction bias, we further propose a gradient-free debiasing method named Double-Debiasing (DDbias) by collecting and prompting with extremely few unconfounded examples, correcting pre-trained TFs with unbiased ICL predictions. Extensive experiments on regression tasks across diverse designs of the TF architectures and data generation protocols verify both our theoretical results and the effectiveness of the proposed DDbias method.
Primary Area: learning theory
Submission Number: 8167
Loading