Keywords: Large Language Models, Parameter-Efficient Fine-Tuning, Low-rank Adaptation
Abstract: Low-Rank Adaptation (LoRA) is widely used for parameter-efficient fine-tuning (PEFT) of large language models (LLMs). Yet its uniform activation of all rank components can lead to task interference and hinder generalization when fine-tuning a model on multiple tasks and datasets. We introduce Gated LoRA, which employs input-dependent gating to selectively activate only the most relevant rank-1 directions. The key design is a dual-purpose projection: the same matrices that compute LoRA features also drive rank selection, adding no extra trainable parameters. Across nine language understanding benchmarks and diverse LLM backbones, Gated LoRA reduces task interference and achieves accuracy gains over standard LoRA of up to 2.9 points in multi-task settings and 3.6 points in single-task fine-tuning, while incurring negligible inference latency overhead and no additional training parameters. Our results demonstrate that fine-grained, input-dependent adaptation makes LoRA more robust, adaptive, and interference-resistant, suggesting a path toward scalable multi-task PEFT in LLMs.
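To make the gating idea concrete, the following is a minimal PyTorch sketch of a LoRA layer whose down-projection output is reused both as the low-rank features and as the signal that selects which rank-1 directions to activate. The class name `GatedLoRALinear`, the magnitude-based scoring, and the hard top-k gate are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class GatedLoRALinear(nn.Module):
    """Sketch of input-dependent rank gating for LoRA (hypothetical details).

    The down-projection A(x) is used twice: as the low-rank features and as
    the score for each rank-1 direction, so no gating parameters are added
    beyond standard LoRA's A and B matrices.
    """

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0, top_k: int = 4):
        super().__init__()
        self.base = base                                       # frozen pretrained layer
        self.base.weight.requires_grad_(False)
        self.A = nn.Linear(base.in_features, r, bias=False)    # down-projection
        self.B = nn.Linear(r, base.out_features, bias=False)   # up-projection
        nn.init.zeros_(self.B.weight)                          # LoRA update starts at zero
        self.scaling = alpha / r
        self.top_k = top_k                                     # rank-1 directions kept per input

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.A(x)                              # (..., r) low-rank features
        # Score each rank component from the same projection (no extra params);
        # magnitude-based scoring is an assumption made for this sketch.
        scores = z.abs()
        gate = torch.zeros_like(z)
        topk = scores.topk(self.top_k, dim=-1).indices
        gate.scatter_(-1, topk, 1.0)               # hard top-k gate per input token
        return self.base(x) + self.B(z * gate) * self.scaling
```

Because the gate is derived from A(x) itself, the only extra inference cost is the per-token scoring and masking, which is consistent with the negligible-latency claim above.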
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 24170