TL;DR: Iterative Vectors enhance in-context learning by editing activations in language models with simulated gradients, demonstrating significant improvements across tasks without backpropagation.
Abstract: In-context learning has become a standard approach for utilizing language models.
However, selecting and processing suitable demonstration examples can be challenging and time-consuming, especially when many examples are involved.
We propose Iterative Vectors (IVs), a technique that explores activation space to enhance in-context performance by simulating gradient updates during inference.
IVs extract and iteratively refine activation-based meta-gradients, applying them during inference without requiring backpropagation at any stage.
We evaluate IVs across various tasks using four popular models and observe significant improvements.
Our findings suggest that in-context activation steering is a promising direction, opening new avenues for future research.
Lay Summary: Large language models (LLMs) are powerful and are often adapted to specific tasks through "In-Context Learning" (ICL), i.e., by showing examples in the prompt. However, ICL is sensitive to the choice of examples, lengthens prompts, and can produce inconsistent results.
Our research presents Iterative Vectors (IVs) as a new approach. IVs capture hidden, task-specific internal adjustments that LLMs learn from examples. The key is an iterative process: we extract and refine these internal adjustments by processing examples in batches, making the IVs more stable and effective.
Once refined, these IVs can "steer" the model's internal state to perform a task. This happens without including in the prompt the examples used to produce the IVs, and without expensive retraining.
Tests show IVs significantly improve performance compared to standard ICL, while also being more efficient and reliable. This work offers a promising new way to adapt LLMs by directly leveraging and manipulating their internal task representations.
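To make the idea concrete, below is a minimal, hedged sketch of the general recipe described above: extract an activation-space vector from batches of demonstrations, refine it iteratively, and add it to the model's hidden states at inference time, with no backpropagation. This is not the authors' implementation; the extraction rule (contrasting few-shot and zero-shot activations at the last prompt token), the moving-average refinement, the steered layer, and the scaling factor are all simplifying assumptions made for illustration.

```python
# Illustrative sketch of iterative activation steering (NOT the paper's exact method).
# Assumptions: GPT-2 as the backbone, layer index LAYER, scale ALPHA, momentum MOMENTUM,
# and a last-token activation difference as a stand-in for the activation-based meta-gradient.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # any decoder-only model with blocks at model.transformer.h
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

LAYER = 6        # hypothetical: which block's residual stream to steer
ALPHA = 0.1      # hypothetical: steering strength
MOMENTUM = 0.9   # hypothetical: smoothing factor across demonstration batches

def last_token_hidden(prompt: str) -> torch.Tensor:
    """Return the chosen layer's hidden state at the final prompt token."""
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, output_hidden_states=True)
    # hidden_states[0] is the embedding output, so block LAYER is index LAYER + 1
    return out.hidden_states[LAYER + 1][0, -1, :]

# 1) Extract and iteratively refine the vector from batches of demonstrations.
#    Each delta contrasts a query preceded by demonstrations against the bare query.
demo_batches = [
    [("The movie was dull.", "negative"), ("A delightful film!", "positive")],
    [("Terrible pacing.", "negative"), ("Loved every minute.", "positive")],
]
query = "Review: An instant classic. Sentiment:"

iv = torch.zeros(model.config.hidden_size)
for batch in demo_batches:
    context = "".join(f"Review: {x} Sentiment: {y}\n" for x, y in batch)
    delta = last_token_hidden(context + query) - last_token_hidden(query)
    iv = MOMENTUM * iv + (1 - MOMENTUM) * delta  # gradient-free refinement

# 2) Apply the vector at inference via a forward hook: no demonstrations in the
#    prompt and no backpropagation anywhere.
def steer(_module, _inputs, output):
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + ALPHA * iv.to(hidden.dtype)
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

handle = model.transformer.h[LAYER].register_forward_hook(steer)
with torch.no_grad():
    ids = tok(query, return_tensors="pt").input_ids
    gen = model.generate(ids, max_new_tokens=3, do_sample=False)
print(tok.decode(gen[0, ids.shape[1]:]))
handle.remove()
```

The moving average here is only one plausible way to "iterate"; the paper's refinement procedure and the exact form of its simulated gradients may differ.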
Link To Code: https://github.com/ArkciaTheDragon/iterative-vectors
Primary Area: Deep Learning->Large Language Models
Keywords: In-Context Learning, Activation Steering, Backpropagation-Free
Submission Number: 11849