Keywords: Large Language Models; Efficient Fine-Tuning; Low-rank Adaptation
TL;DR: ReLoRA reduces training overhead and parameter interference by sharing a low-rank A matrix across layers and selectively updating per-layer B matrices, enabling efficient and robust fine-tuning.
Abstract: Low-Rank Adaptation (LoRA) is a widely adopted parameter-efficient fine-tuning (PEFT) method for large language models (LLMs), but it still incurs notable overhead and suffers from parameter interference on complex datasets. While recent works decouple the LoRA update matrices to exploit matrix-wise asymmetry, training costs remain high. We revisit LoRA from the perspective of inter-matrix and intra-layer parameter redundancy and propose Resource-Efficient Low-Rank Adaptation (ReLoRA), a lightweight and generalizable approach for language, multimodal, and diffusion models. ReLoRA employs a unified A matrix across all transformer layers and introduces runtime selective updates of the B matrices to dynamically trade off the system resource budget against model performance. ReLoRA consistently outperforms LoRA across diverse modalities, including commonsense reasoning, visual instruction tuning, and image generation, demonstrating improved efficiency and robustness. Anonymized code is submitted with the paper and will be made publicly available.
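To make the two ideas in the abstract concrete, below is a minimal PyTorch sketch of a LoRA-style adapter whose A matrix is tied across all layers, plus a runtime routine that enables gradients for only a subset of the per-layer B matrices. All names (`SharedALoRALinear`, `select_b_updates`) and the norm-based selection criterion are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SharedALoRALinear(nn.Module):
    """LoRA-style adapter: one A matrix shared across all layers, per-layer B."""

    def __init__(self, base: nn.Linear, shared_A: nn.Parameter, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # pretrained weights stay frozen
            p.requires_grad_(False)
        self.A = shared_A                  # (rank, in_features), tied across layers
        rank = shared_A.shape[0]
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # per-layer
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Standard LoRA update W x + scale * B A x, with A shared across layers.
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T


def select_b_updates(adapters, keep_ratio: float = 0.5) -> None:
    """Runtime selection: only the top-k B matrices (here ranked by weight norm,
    an assumed criterion) receive gradients, trading the resource budget
    against adaptation capacity."""
    norms = torch.stack([a.B.detach().norm() for a in adapters])
    k = max(1, int(keep_ratio * len(adapters)))
    keep = set(norms.topk(k).indices.tolist())
    for i, a in enumerate(adapters):
        a.B.requires_grad_(i in keep)


# Toy usage: two layers share a single A; half of the B matrices stay trainable.
rank, d_in, d_out = 8, 64, 64
shared_A = nn.Parameter(torch.randn(rank, d_in) * 0.01)
layers = [SharedALoRALinear(nn.Linear(d_in, d_out), shared_A) for _ in range(2)]
select_b_updates(layers, keep_ratio=0.5)
```

Under this reading, sharing A removes the inter-matrix redundancy the abstract mentions, while the selective B updates are the knob that trades system resources against performance at runtime.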
Supplementary Material: zip
Primary Area: generative models
Submission Number: 23916