Keywords: domain generalization, Large Language Model, LLM, activation steering, activation engineering, generative, classification, out of distribution, OOD, style transfer
TL;DR: We introduce CONTXT, a simple and intuitive way to augment contextual information in feature representations that can improve classifier performance and steer LLM outputs without retraining.
Abstract: Artificial Neural Networks (ANNs) are increasingly deployed across diverse domains, often requiring them to generalize beyond their training conditions. This shift in context frequently leads to performance degradation, a central challenge in Domain Generalization (DG). While numerous techniques exist to mitigate this issue (e.g., fine-tuning, activation steering, meta-learning, adversarial training, normalization-based approaches, and parameter-efficient methods such as prompt tuning), they are often complex, resource-intensive, and difficult to scale, particularly for large models such as Large Language Models (LLMs). In contrast, we introduce CONTXT (\emph{\textbf{C}ontextual augmentati\textbf{O}n for \textbf{N}eural fea\textbf{T}ure \textbf{X} \textbf{T}ransforms}): a simple, intuitive, and elegant method for contextual adaptation. CONTXT works by augmenting the model’s internal representations with lightweight, contextually relevant feature indices through straightforward multiplicative and additive vector operations. Despite its simplicity, CONTXT significantly improves performance across both discriminative (e.g., classification with ANNs/CNNs) and generative (e.g., LLM) tasks. With minimal computational overhead and straightforward integration, CONTXT layers offer a practical and effective solution to DG and a variety of problems facing ANNs, demonstrating that strong results need not come at the cost of complexity. More generally, CONTXT provides a compact mechanism to manipulate information flow and steer ANN processing in a desired direction without retraining the network.
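The abstract only states that CONTXT augments internal representations "through straightforward multiplicative and additive vector operations"; the sketch below is one plausible reading of that description, not the authors' implementation. It assumes a hypothetical `contxt_layer` that rescales and shifts a batch of feature vectors by per-dimension context vectors:

```python
import numpy as np

def contxt_layer(features, context_scale, context_shift):
    """Hypothetical CONTXT-style transform: augment internal features
    with contextual information via an elementwise multiplicative and
    additive operation. The exact form is an assumption drawn from the
    abstract's description, not the paper's actual layer.

    features:      (batch, dim) array of internal representations
    context_scale: (dim,) multiplicative context vector
    context_shift: (dim,) additive context vector
    """
    return features * context_scale + context_shift

# Toy usage: steer 4-d representations toward a context direction
# without touching the surrounding network's weights.
h = np.ones((2, 4))
scale = np.array([1.0, 0.5, 2.0, 1.0])
shift = np.array([0.0, 0.1, -0.1, 0.0])
h_ctx = contxt_layer(h, scale, shift)
```

Because the transform is a pure function of fixed context vectors, it can be inserted between existing layers at inference time, consistent with the claim that CONTXT steers processing without retraining.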
Supplementary Material: zip
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 21899