NUCLEAR-NORM MAXIMIZATION FOR LOW-RANK UPDATES

Published: 18 Apr 2024 · Last Modified: 07 May 2025 · OpenReview Archive Direct Upload · CC BY-SA 4.0
Abstract: Pre-trained large language models exhibit significant potential in speech and language processing. However, fine-tuning all of their parameters becomes impractical when a model must serve numerous downstream tasks. To address this challenge, various low-rank adaptation techniques have been introduced for parameter-efficient fine-tuning; they freeze the overparameterized model and learn incremental parameter updates within smaller subspaces. We observe, however, that most directions of the learned subspace contribute little to the incremental updates, so fine-tuned models may fall short of optimal performance. To bridge this gap, we introduce NNM-LoRA, which strives to exploit more meaningful singular directions. Through Nuclear Norm Maximization (NNM), we can better regulate how singular values are allocated across directions, and we accordingly propose a parameter-free, plug-and-play regularizer for low-rank updates. This approach allows us to utilize as many singular directions of the subspace as possible while training the low-rank updates. To validate the effectiveness of NNM-LoRA, we conduct extensive experiments with different pre-trained models on a range of natural language understanding tasks. Results demonstrate that NNM-LoRA yields significant improvements over baseline methods.