The Chosen Few: Sparse Adaptation for Large Models

06 Sept 2025 (modified: 11 Feb 2026) | Submitted to ICLR 2026 | CC BY 4.0
Keywords: PEFT, finetuning, SAE
Abstract: Parameter-Efficient Fine-Tuning (PEFT) methods have become essential for adapting large pretrained models to downstream tasks, with Low-Rank Adaptation (LoRA) emerging as one of the most widely adopted solutions. However, there remain several key limitations in current LoRA-based PEFT methods: (1) the low-rank feature space in LoRA is rigid, reducing its capacity for dynamic adaptation; (2) the restricted dimensionality, coupled with dense and entangled representations, constrains the model’s capacity to generalize across multiple domains; and (3) the compression process limits the extent to which model behavior can be understood from the learned representations, making it difficult to interpret the functional role of task-relevant features. In this paper, we argue that sparse adaptation offers a principled and more flexible alternative to low-rank adaptation, with the added benefit of enhancing interpretability. Instead of compressing information into a low-rank subspace, sparse adaptation focuses on identifying and selectively activating a small subset of high-dimensional latent features, enabling a more decomposed and dynamic fine-tuning process. Building on this paradigm, we propose STAN (Sparse adapTAtioN), a novel method that actualizes sparse adaptation by integrating dedicated Sparse Autoencoder (SAE) modules into frozen pretrained models. STAN learns to encode task-specific adaptations through sparse activations within the SAEs, thereby using sparse features as the mechanism for dynamic and robust adaptation. Beyond the flexibility offered by input-dependent sparse combinations, the large latent space of the SAEs provides scalable capacity for cross-domain adaptation, while their inherent semantic decomposition structure supports more interpretable representations. 
Through extensive experiments, we demonstrate that STAN outperforms state-of-the-art PEFT baselines across a range of benchmarks, while uniquely enabling inspection and analysis of the learned sparse activations. Our findings position sparse adaptation as a promising new direction in PEFT, advancing both the expressivity and interpretability of model adaptation.
Primary Area: foundation or frontier models, including LLMs
Submission Number: 2686
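
The abstract describes the mechanism only at a high level: a wide SAE latent space inside a frozen model, with a small input-dependent subset of features active per example. The following is a minimal sketch of that general idea, not the authors' implementation. The function name `topk_sparse_adapter`, the ReLU encoder, the top-k selection rule, the residual placement, and the zero-initialized decoder are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np

def topk_sparse_adapter(h, W_enc, b_enc, W_dec, k):
    """Hypothetical sparse adapter: add a sparse, input-dependent correction to h.

    h:     (batch, d) hidden states from a frozen layer
    W_enc: (d, m) encoder into a wide latent space (m much larger than d)
    b_enc: (m,) encoder bias
    W_dec: (m, d) decoder back to the model dimension
    k:     number of latent features kept per input (top-k is an assumption)
    """
    # Non-negative latent activations, shape (batch, m)
    z = np.maximum(h @ W_enc + b_enc, 0.0)
    # Column indices of the k largest latents in each row
    idx = np.argpartition(z, -k, axis=1)[:, -k:]
    # Keep only those k entries per row; zero out the rest
    mask = np.zeros_like(z)
    np.put_along_axis(mask, idx, 1.0, axis=1)
    z_sparse = z * mask
    # Residual update: the frozen representation plus the sparse correction
    return h + z_sparse @ W_dec, z_sparse

# Toy usage with made-up sizes
rng = np.random.default_rng(0)
d, m, k = 8, 64, 4
h = rng.normal(size=(3, d))
W_enc = rng.normal(size=(d, m))
b_enc = np.zeros(m)
W_dec = np.zeros((m, d))  # assumed zero init, so the adapter starts as the identity
out, z = topk_sparse_adapter(h, W_enc, b_enc, W_dec, k)
```

In this sketch only `W_enc`, `b_enc`, and `W_dec` would be trained, and the returned `z` is the per-input sparse code that could be inspected for interpretability. By contrast, a LoRA-style update passes every input through the same dense rank-r bottleneck.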