Emergence and Effectiveness of Task Vectors in In-Context Learning: An Encoder-Decoder Perspective

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 Spotlight Poster · CC BY 4.0
TL;DR: We study how task vectors emerge in transformers through an encoder-decoder framework. We show that the ability to decode task from representations correlates with task vector effectiveness and predicts LLMs' in-context learning ability.
Abstract: Autoregressive transformers exhibit adaptive learning through in-context learning (ICL), raising the question of how this ability arises. Prior work has shown that transformers represent ICL tasks as vectors in their representations. In this paper, we leverage the encoding-decoding framework to study how transformers form task vectors during pretraining and how their task-encoding quality predicts ICL task performance. On synthetic ICL tasks, we analyze the training dynamics of a small transformer and report the coupled emergence of task encoding and decoding. As the model learns to encode different latent tasks (e.g., "Finding the first noun in a sentence.") into distinct, separable representations, it concurrently builds conditional decoding algorithms and improves its ICL performance. We validate this phenomenon across pretrained models of varying scales (Gemma-2 2B/9B/27B, Llama-3.1 8B/70B) and over the course of pretraining in OLMo-7B. Further, we demonstrate that the quality of task encoding inferred from representations predicts ICL performance, and that, surprisingly, finetuning the earlier layers can improve task encoding and performance more than finetuning the later layers. Our empirical insights shed light on the success and failure modes of large language models via their representations.
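To make the measurement concrete, below is a minimal sketch (not the paper's code) of one way to estimate how decodable the latent task is from a model's hidden states: extract the last-token representation of a few ICL prompts per task and check how well a linear probe separates the tasks. The model name, layer index, toy prompts, and choice of probe are all illustrative assumptions, not the authors' setup.

```python
# Minimal sketch: probe task-encoding separability from hidden states.
# Assumes a HuggingFace causal LM; "gpt2" is a stand-in (the paper studies
# Gemma-2, Llama-3.1, and OLMo-7B), and the prompts/layer/probe are hypothetical.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

MODEL_NAME = "gpt2"
LAYER = 6  # which hidden layer to probe; a free choice in this sketch

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()

# Toy few-shot prompts for two latent tasks: "first noun" (0) vs. "antonym" (1).
prompts = [
    ("dog runs fast -> dog\ncat sleeps now -> cat\nbird sings loud ->", 0),
    ("sun sets late -> sun\nrain falls hard -> rain\nwind blows cold ->", 0),
    ("kid plays ball -> kid\nfish swims deep -> fish\nowl hoots soft ->", 0),
    ("hot -> cold\nbig -> small\nfast ->", 1),
    ("up -> down\nlight -> dark\nwet ->", 1),
    ("old -> new\nhard -> soft\nhigh ->", 1),
]

features, labels = [], []
with torch.no_grad():
    for text, task_id in prompts:
        inputs = tokenizer(text, return_tensors="pt")
        hidden = model(**inputs).hidden_states[LAYER]  # (1, seq_len, d_model)
        features.append(hidden[0, -1].numpy())         # last-token representation
        labels.append(task_id)

# "Task decodability": cross-validated accuracy of a linear probe that
# recovers the latent task identity; higher means more separable encodings.
probe = LogisticRegression(max_iter=1000)
score = cross_val_score(probe, features, labels, cv=3).mean()
print(f"Layer {LAYER} task-decoding accuracy: {score:.2f}")
```

In this framing, sweeping LAYER (or checkpoints over pretraining) and comparing the probe accuracy against downstream ICL accuracy is one simple way to test whether task-encoding quality tracks ICL performance, which is the kind of relationship the abstract describes.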
Lay Summary: Think of today’s large language models as very fast learners that can pick up a new mini-skill—say, “find the first noun in a sentence”—just from the handful of examples you type into the prompt. Researchers call this trick *in-context learning (ICL).* We set out to watch, step by step, how a model grows that ability while it is being trained.
1. Inside the model, every mini-skill becomes a direction in its “thought space.” As the model reads billions of sentences during pre-training, it gradually learns to point different skills in different directions so they don’t blur together.
2. The skill label and the algorithm for that skill appear together. The moment those directions separate cleanly, the model also figures out the matching algorithm for using them—so its ICL accuracy shoots up at the same time.
3. Bigger models show the same pattern. We confirmed the effect in Google’s Gemma-2 (2B–27B parameters), Meta’s Llama-3.1 (8B and 70B), and throughout the training run of OLMo-7B.
4. The separability of the skills predicts ICL success. Measuring how distinct those internal directions are tells us, before we even test the model, how well it will do at ICL.
Peering at these hidden directions lets us forecast when a language model will excel at on-the-fly learning and when it might stumble.
Link To Code: https://charming-centaur-089.notion.site/Emergence-and-Effectiveness-of-Task-Vectors-in-In-Context-Learning-An-Encoder-Decoder-Perspective-2054664a1d59814f8401cded3332fce4?source=copy_link
Primary Area: Deep Learning->Large Language Models
Keywords: in context learning, task vectors, mechanistic interpretability
Submission Number: 7241