Keywords: Steering, Causal interventions, Understanding high-level properties of models
Other Keywords: In-context Learning, Task Vectors, Large Language Models
TL;DR: We propose Adaptive Task Vectors (ATV), a framework that dynamically generates input-conditioned task vectors for large language models, offering strong performance and greater flexibility than LoRA and previous fixed-vector or prompt-based methods.
Abstract: In-Context Learning (ICL) enables Large Language Models (LLMs) to perform tasks without parameter updates by conditioning on a few demonstrations provided in the prompt. Despite its success, ICL suffers from several limitations, including sensitivity to demonstration order, context length constraints, and limited control over internal reasoning mechanisms. To address these challenges, task vector-based approaches compress task information into a single vector. However, these methods typically construct task vectors from fixed sets of demonstrations and reuse them across input queries, without conditioning on the specific input. This reduces their ability to probe or guide the model’s internal computation, makes adaptation to diverse or misaligned queries difficult, and degrades generalization to unseen tasks. To overcome this limitation, we propose Adaptive Task Vectors (ATV), a simple and effective framework that dynamically generates task vectors conditioned on each input query. ATV employs a small language model to generate task vectors, which are then transformed to match the target LLM’s architecture and applied to guide its output generation. In contrast to ICL and previous vector-based approaches, which rely on fixed demonstration sets and their corresponding vectors, ATV produces task vectors tailored to each specific input query and task. As a result, ATV serves as an effective tool for probing and guiding the internal mechanisms of LLMs, enabling strong performance and enhanced insight, even for unseen tasks. Furthermore, we provide a theoretical analysis indicating that ATV is expressively equivalent to LoRA under equal rank budgets and more expressive than Prefix-Tuning, thereby offering formal support for its representational advantage.
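To make the described mechanism concrete, below is a minimal, hypothetical PyTorch sketch of the ATV idea as stated in the abstract: a small model encodes the input query, its summary is projected into one vector per target-LLM layer, and each vector is added to that layer's hidden states to steer generation. The class names, the GRU stand-in for the small language model, the dimensions, and the additive steering rule are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class AdaptiveTaskVectorSketch(nn.Module):
    """Hypothetical sketch: generate input-conditioned task vectors for a target LLM.

    A small encoder (a GRU here, standing in for the paper's small language model)
    summarizes the query; a linear projector maps that summary into one task vector
    per target-LLM layer, matching the target hidden size.
    """

    def __init__(self, small_hidden=512, target_hidden=4096, num_target_layers=32):
        super().__init__()
        self.small_encoder = nn.GRU(small_hidden, small_hidden, batch_first=True)
        self.projector = nn.Linear(small_hidden, target_hidden * num_target_layers)
        self.target_hidden = target_hidden
        self.num_target_layers = num_target_layers

    def forward(self, query_embeddings):
        # query_embeddings: (batch, seq_len, small_hidden)
        _, summary = self.small_encoder(query_embeddings)      # (1, batch, small_hidden)
        task_vectors = self.projector(summary.squeeze(0))      # (batch, layers * target_hidden)
        return task_vectors.view(-1, self.num_target_layers, self.target_hidden)


def apply_task_vector(hidden_states, layer_task_vector, scale=1.0):
    """Assumed steering rule: add the layer's task vector to every token position."""
    # hidden_states: (batch, seq_len, target_hidden); layer_task_vector: (batch, target_hidden)
    return hidden_states + scale * layer_task_vector.unsqueeze(1)


if __name__ == "__main__":
    atv = AdaptiveTaskVectorSketch()
    query = torch.randn(2, 16, 512)                 # toy query embeddings
    vectors = atv(query)                            # (2, 32, 4096)
    dummy_hidden = torch.randn(2, 16, 4096)         # toy hidden states from one LLM layer
    steered = apply_task_vector(dummy_hidden, vectors[:, 0])
    print(steered.shape)                            # torch.Size([2, 16, 4096])
```

In this sketch, conditioning on the query is what distinguishes the approach from a fixed task vector: a different input produces different per-layer vectors, rather than reusing one vector built from a fixed demonstration set.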
Submission Number: 164