Keywords: LoRA, Stable feature learning, Diffusion model fine-tuning
Abstract: Low-Rank Adaptation (LoRA) has significantly advanced parameter-efficient fine-tuning of large pretrained models.
LoRA augments the pretrained weights of a model by adding the product of two smaller matrices that together form a low-rank matrix update.
Recent research has shown that scale disparities between these two matrices often cause unstable training dynamics, leading to suboptimal performance.
In this paper, we reformulate low-rank adaptation by learning the weights update as a decomposition of a \textbf{single} low-rank matrix multiplied by its transpose.
This design, called \slora, inherently removes inter-matrix scale conflicts, ensures stable optimization, and roughly halves the parameter count.
We analyze \slora within the efficient feature learning framework, showing that it guarantees stability by construction while preserving the expressive capacity of LoRA in the transformer architecture.
Extensive experiments on multiple tasks validate these benefits.
In commonsense reasoning, fine-tuning LLaMA 7B on MNLI with \slora achieves 91.3\% accuracy — surpassing LoRA (89.1\%) and LoRA+ (90.2\%) — while using only 60\% of their parameter budget. In image generation, fine-tuning Stable Diffusion with \slora on DreamBooth significantly improves image fidelity, achieving a DINO similarity score of 0.151, compared with 0.148 for DoRA and 0.143 for LoRA.
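The contrast between the two update parameterizations can be sketched numerically. This is a minimal NumPy illustration, not the paper's implementation: the dimensions, the scaling of the initializations, and the restriction to a square weight matrix are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 16, 4  # hypothetical weight dimension and adapter rank

W = rng.standard_normal((d, d))  # frozen pretrained weight (square, for illustration)

# Standard LoRA: two independent trainable matrices A (d x r) and B (r x d);
# their scales can drift apart during training (the instability the abstract cites).
A = rng.standard_normal((d, r)) * 0.01
B = rng.standard_normal((r, d)) * 0.01
delta_lora = A @ B                    # rank <= r, 2*d*r trainable parameters

# Single-matrix variant as described in the abstract: the update is one
# low-rank matrix times its own transpose, so there is only one scale.
C = rng.standard_normal((d, r)) * 0.01
delta_single = C @ C.T                # rank <= r, d*r trainable parameters

assert delta_single.shape == W.shape
assert np.allclose(delta_single, delta_single.T)      # symmetric by construction
assert np.linalg.matrix_rank(delta_single) <= r
print(2 * d * r, d * r)               # parameter counts: the second is half the first
```

The single-matrix form uses d*r trainable parameters instead of 2*d*r, matching the abstract's claim of a roughly halved parameter budget.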
Submission Number: 101