Abstract: Memory-efficient learning is crucial for reducing GPU consumption and enabling scalable training of large language models. Low-rank adaptation has proven effective for fine-tuning by injecting low-rank matrices into frozen pre-trained weights. However, these methods often lag behind full-rank training due to limited expressiveness and disrupted optimization dynamics. Conversely, projecting gradient updates onto a low-rank subspace improves training performance while simultaneously decreasing memory overhead. In this paper, we propose \textbf{Lotus}, a method that speeds up gradient projection via randomized SVD and further reduces memory cost. In addition, we propose an \textbf{adaptive subspace switching strategy} guided by the average displacement of the unit gradient, which enables dynamic subspace updates for improved convergence. Experimental results demonstrate that Lotus is currently \textbf{the most efficient method}, surpassing full-rank training both in pre-training LLaMA-type models on the C4 dataset and in fine-tuning across multiple tasks. Our code will be released soon.
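The abstract does not detail the projection mechanics, so the following is a minimal sketch, not the paper's implementation, of how randomized-SVD-based low-rank gradient projection typically works (in the spirit of GaLore-style methods): a randomized SVD estimates the dominant subspace of a weight gradient, optimizer state is kept in that low-rank subspace, and updates are projected back to full size. The rank `r`, the `niter` refinement count, and all function names here are illustrative assumptions.

```python
# Hypothetical sketch of randomized-SVD gradient projection;
# not the authors' Lotus implementation.
import torch


def compute_projector(grad: torch.Tensor, r: int = 8) -> torch.Tensor:
    # Randomized SVD of the gradient matrix; the columns of U span an
    # estimate of the dominant r-dimensional gradient subspace.
    U, _, _ = torch.svd_lowrank(grad, q=r, niter=2)
    return U  # shape: (m, r)


def project(grad: torch.Tensor, U: torch.Tensor) -> torch.Tensor:
    # Map the full (m, n) gradient into the (r, n) subspace, where the
    # optimizer states would be stored to save memory.
    return U.T @ grad


def project_back(low_rank_update: torch.Tensor, U: torch.Tensor) -> torch.Tensor:
    # Map the optimizer's low-rank update back to the full parameter space.
    return U @ low_rank_update


if __name__ == "__main__":
    g = torch.randn(1024, 4096)       # gradient of one weight matrix
    U = compute_projector(g, r=8)     # refreshed periodically / adaptively
    g_lr = project(g, U)              # (8, 4096) low-rank gradient
    update = project_back(g_lr, U)    # back to (1024, 4096)
    print(g_lr.shape, update.shape)
```

In such schemes the projector is recomputed only occasionally; the paper's adaptive subspace switching strategy decides when to refresh it based on the average displacement of the unit gradient, a criterion whose details are not given in this abstract.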
Paper Type: Long
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: Efficient/Low-Resource Methods for NLP, Large Language Model, Pre-training, Fine-tuning
Contribution Types: Approaches to low-resource settings, Approaches to low compute settings-efficiency
Languages Studied: English
Submission Number: 3748