Keywords: Transformers, Self-Adaptation, Fine-tuning, Dynamic System, LLMs
Abstract: Self-adaptive large language models (LLMs) aim to solve the challenges posed
by traditional fine-tuning methods, which are often computationally intensive and inflexible for diverse tasks.
We introduce $\text{Transformer}^2$, a novel framework that adapts LLMs to unseen tasks in real time by selectively adjusting only the singular components of their weight matrices. During inference, a two-pass mechanism first identifies the task and then mixes task-specific "expert" vectors to best cope with test-time conditions. Our approach outperforms ubiquitous methods such as LoRA with fewer parameters and greater efficiency, generalizes across LLM architectures and modalities, and offers a scalable solution for enhancing the adaptability and task-specific performance of LLMs, paving the way for truly self-organizing AI systems.
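The abstract's core idea of "selectively adjusting singular components" can be illustrated with a minimal sketch: decompose a frozen weight matrix via SVD, rescale its singular values with a mixture of per-task "expert" vectors, and reassemble the matrix. The function name `adapt_weight`, the expert vectors, and the mixing weights `alpha` below are hypothetical illustrations, not the authors' implementation.

```python
# Minimal sketch (assumption): singular-value rescaling with mixed expert vectors.
# A frozen weight W is decomposed, its singular values are scaled by a
# combination of per-task expert vectors, and the matrix is reassembled.
import torch

def adapt_weight(W: torch.Tensor,
                 experts: list[torch.Tensor],
                 alpha: torch.Tensor) -> torch.Tensor:
    """Return W' = U diag(sigma * z_mix) V^T, where z_mix = sum_k alpha_k * z_k."""
    U, sigma, Vh = torch.linalg.svd(W, full_matrices=False)
    z_mix = sum(a * z for a, z in zip(alpha, experts))   # mix expert vectors
    return U @ torch.diag(sigma * z_mix) @ Vh             # rescale singular values

# Toy usage: two hypothetical experts and mixing weights that would come
# from a first (task-identification) pass.
W = torch.randn(4, 4)
experts = [torch.ones(4), 1.5 * torch.ones(4)]
alpha = torch.tensor([0.7, 0.3])
W_adapted = adapt_weight(W, experts, alpha)
```

Because each expert is just a vector over singular values, this parameterization uses far fewer trainable parameters per task than a low-rank update such as LoRA, which is the efficiency contrast the abstract draws.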
Submission Number: 73