NoRA: Nested Low-Rank Adaptation for Efficient Fine-Tuning Large Models

13 Sept 2024 (modified: 13 Nov 2024) · ICLR 2025 Conference Withdrawn Submission · CC BY 4.0
Keywords: Parameter-efficient fine-tuning, Low-Rank Adaptation, Large Language Models
TL;DR: We present NoRA, a nested parameter-efficient Low-Rank Adaptation (LoRA) structure that optimizes large-model fine-tuning through serial structures and activation-aware singular value decomposition.
Abstract: Low-Rank Adaptation (LoRA) has become a popular paradigm for fine-tuning large models, but it still requires a substantial number of training parameters. To address this issue, we first conduct comprehensive empirical studies on parameter-efficient LoRA structures and derive design guidelines that emphasize serial structures, optimal placements, and nested LoRA. Based on these insights, we present NoRA, a nested parameter-efficient LoRA structure that rethinks how projection matrices are initialized and fine-tuned. NoRA freezes the outer-layer LoRA weights and employs a serial inner-layer design, enabling precise task-specific adaptation while keeping the set of trainable parameters compact. In addition, we propose activation-aware Singular Value Decomposition (AwSVD), which adjusts the weight matrices according to activation distributions to initialize the outer-layer LoRA weights; this scheme improves decomposition accuracy and mitigates computational errors. Extensive evaluations across multiple linguistic and visual tasks demonstrate that NoRA outperforms state-of-the-art LoRA variants, achieving significant gains in efficiency and effectiveness on models such as Mistral-7B, Gemma-7B, and LLaMA-3 8B. Notably, compared with LoRA on LLaMA-3 8B, NoRA reduces fine-tuning parameters, training time, and memory usage by 85.5\%, 37.5\%, and 8.9\%, respectively, while improving performance by 1.9\%. Code is available in the supplementary materials.
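To make the nested structure described in the abstract concrete, the following is a minimal illustrative sketch, not the authors' released implementation. The module and argument names (NoRALinear, awsvd_init, outer_rank, inner_rank, act_scale) are hypothetical, and the exact form of the activation-aware scaling and of the serial inner update are assumptions based on the abstract: a frozen outer LoRA pair initialized by activation-aware SVD of the pretrained weight, with a small trainable LoRA pair nested serially inside it.

```python
# Illustrative sketch only -- names and details are assumptions, not the paper's code.
import torch
import torch.nn as nn


def awsvd_init(weight: torch.Tensor, act_scale: torch.Tensor, rank: int):
    """Assumed form of activation-aware SVD: scale the input channels of W by
    per-channel activation statistics, truncate the SVD, then fold the scaling
    back into the right factor so that A @ B approximates W."""
    s = act_scale.clamp(min=1e-6)                           # (in,) activation scales
    U, sigma, Vh = torch.linalg.svd(weight * s, full_matrices=False)
    A = U[:, :rank] * sigma[:rank]                          # (out, rank)
    B = Vh[:rank, :] / s                                    # (rank, in), scaling undone
    return A, B


class NoRALinear(nn.Module):
    """Frozen base weight + frozen outer LoRA (AwSVD init) + trainable inner LoRA."""

    def __init__(self, base: nn.Linear, act_scale: torch.Tensor,
                 outer_rank: int = 16, inner_rank: int = 4):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)

        # Outer LoRA pair: initialized from the pretrained weight via AwSVD, then frozen.
        A_out, B_out = awsvd_init(base.weight.data, act_scale, outer_rank)
        self.outer_A = nn.Parameter(A_out, requires_grad=False)     # (out, r_outer)
        self.outer_B = nn.Parameter(B_out, requires_grad=False)     # (r_outer, in)

        # Inner LoRA pair: small and trainable, applied serially inside the outer pair.
        self.inner_A = nn.Parameter(torch.zeros(outer_rank, inner_rank))
        self.inner_B = nn.Parameter(torch.randn(inner_rank, outer_rank) * 0.01)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = x @ self.outer_B.T                               # project to the outer rank
        h = h + h @ self.inner_B.T @ self.inner_A.T          # nested trainable update
        return self.base(x) + h @ self.outer_A.T             # project back to outputs
```

Under this reading, only inner_A and inner_B receive gradients during fine-tuning, which is what keeps the trainable parameter count far below that of a standard LoRA adapter of the same outer rank.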
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 86