Keywords: Parameter-Efficient Fine-Tuning, LoRA, LLM
TL;DR: We address the low-rank bottleneck in LoRA by integrating nonlinear mappings at a compressed rank, achieving a strong balance between parameter efficiency and model performance.
Abstract: Low-Rank Adaptation (LoRA) is a widely adopted parameter-efficient fine-tuning (PEFT) method validated across NLP and CV domains. However, LoRA faces an inherent low-rank bottleneck: narrowing its performance gap with full fine-tuning requires increasing the rank of its parameter matrix, resulting in significant parameter overhead. Recent linear LoRA variants have attempted to enhance expressiveness by introducing additional linear mappings; however, their composition remains inherently linear and fails to fundamentally improve LoRA’s representational capacity. To address this limitation, we propose \ourmethod, which incorporates an Adaptive Nonlinear Layer (ANL) between two linear projectors to capture \emph{fixed} and \emph{learnable} nonlinearities. This combination forms an {\fontfamily{lmtt}\selectfont \textbf{MLP-like structure}} with a compressed rank, enabling flexible and precise approximation of diverse target functions while theoretically guaranteeing lower approximation errors and bounded gradients. Extensive experiments on 22 datasets and 6 pretrained models demonstrate that \ourmethod: (\textbf{I}) not only matches or surpasses full fine-tuning performance with only $6.18\%\sim25\%$ of LoRA’s parameters but also (\textbf{II}) outperforms state-of-the-art PEFT methods by up to $10.88\%$ in both NLP and CV tasks, and \textbf{(III)} exhibits robust performance across various rank configurations.
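To make the abstract's "MLP-like structure" concrete, here is a minimal numpy sketch of a LoRA-style adapter with a nonlinearity between the two linear projectors. The per-channel blend of a fixed nonlinearity (tanh) and the identity, gated by a learnable `alpha`, is a hypothetical stand-in for the paper's Adaptive Nonlinear Layer, whose exact form differs.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4  # hidden size d, compressed rank r << d

# Two linear projectors, as in vanilla LoRA.
A = rng.normal(scale=0.02, size=(r, d))  # down-projection
B = rng.normal(scale=0.02, size=(d, r))  # up-projection

# Hypothetical adaptive nonlinearity: per-channel learnable blend of a
# fixed nonlinearity (tanh) and the identity; `alpha` would be trained.
alpha = np.full(r, 0.5)

def adapter(x):
    h = A @ x                                  # rank-r bottleneck
    h = alpha * np.tanh(h) + (1 - alpha) * h   # fixed + learnable nonlinearity
    return B @ h                               # MLP-like update applied to x
```

Unlike a purely linear LoRA composition, this adapter is not homogeneous: `adapter(2 * x)` differs from `2 * adapter(x)`, which is what lets the compressed-rank structure approximate a richer family of target functions.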
Supplementary Material: zip
Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)
Submission Number: 1532