Keywords: in-context learning; in-weight learning; dual-space; model architecture
TL;DR: We propose CoQE, a Transformer with dual-space context-query encoding that reconciles ICL and IWL and achieves lower ICL error than standard Transformers across diverse tasks and data distributions.
Abstract: In-context learning (ICL) is a valuable capability exhibited by Transformers pretrained on diverse sequence tasks. However, prior studies have observed that ICL often conflicts with the model’s inherent in-weight learning (IWL) capability. In this work, we aim to reconcile ICL and IWL by disentangling the model’s encoding spaces for context and input samples. To do so, we first propose a dual-space modeling framework that explicitly models a task representation space as the dual space of the sample representation space. Such a dual-space structure can be derived from the linear representation hypothesis and, as we theoretically prove, is conducive to ICL via representation learning. Furthermore, we show that the standard Transformer architecture with softmax self-attention is inherently limited in realizing this structure. Building on this insight, we introduce CoQE, a Transformer architecture with separate context-query encoding, to realize the disentanglement between context and sample representations. Through experiments on both regression and classification tasks, we demonstrate that CoQE not only achieves lower ICL error than standard Transformers, but also successfully reconciles ICL and IWL under diverse data distributions.
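To make the "separate context-query encoding" idea concrete, below is a minimal, hypothetical sketch of a dual-encoder in-context learner: the demonstration pairs are encoded into a task representation by one module, the query is embedded into a separate sample-representation space, and a bilinear readout couples the two. This is not the authors' CoQE implementation; all module names, dimensions, pooling, and readout choices are assumptions for illustration only.

```python
# Hypothetical sketch (not the authors' code): separating context encoding from
# query encoding for an in-context regression task. All design choices here are
# assumptions made for illustration.
import torch
import torch.nn as nn


class DualSpaceICLModel(nn.Module):
    def __init__(self, x_dim: int, d_model: int = 64, n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        # Context encoder: attends over (x, y) demonstration pairs to form a task representation.
        self.ctx_embed = nn.Linear(x_dim + 1, d_model)
        ctx_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.ctx_encoder = nn.TransformerEncoder(ctx_layer, n_layers)
        # Query encoder: embeds the test input in a separate sample-representation space.
        self.qry_embed = nn.Linear(x_dim, d_model)
        # Readout: bilinear interaction between the task and sample representations.
        self.readout = nn.Bilinear(d_model, d_model, 1)

    def forward(self, ctx_x, ctx_y, qry_x):
        # ctx_x: (B, K, x_dim), ctx_y: (B, K, 1), qry_x: (B, x_dim)
        ctx_tokens = self.ctx_embed(torch.cat([ctx_x, ctx_y], dim=-1))
        task_repr = self.ctx_encoder(ctx_tokens).mean(dim=1)  # pooled task representation
        sample_repr = self.qry_embed(qry_x)                   # query / sample representation
        return self.readout(task_repr, sample_repr).squeeze(-1)


if __name__ == "__main__":
    model = DualSpaceICLModel(x_dim=8)
    ctx_x, ctx_y, qry_x = torch.randn(2, 16, 8), torch.randn(2, 16, 1), torch.randn(2, 8)
    print(model(ctx_x, ctx_y, qry_x).shape)  # torch.Size([2])
```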
Supplementary Material: zip
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 1915