Keywords: Memory-efficient finetuning, Low-rank gradient projection
TL;DR: We introduce VLoRP, which adjusts projection granularity and rank for better memory-performance trade-offs.
Abstract: Low-rank gradient projection (LoRP) has recently emerged as a memory-efficient alternative to low-rank adapters (LoRA) for finetuning large language models. Existing LoRP methods, however, implicitly fix the projection unit to a single gradient row, leaving the effect of grouping multiple rows (or subdividing a row) largely unexplored. In this work, we systematically investigate the impact of the projection unit on LoRP methods. Specifically, we extend existing LoRP approaches by introducing an additional degree of freedom, projection granularity, beyond the traditional rank hyperparameter. This yields a framework capable of performing Various-grained Low-Rank Projection of gradients, which we term VLoRP. Using VLoRP, we observe that, under an identical memory budget, fine-grained projections consistently outperform coarser-grained ones. Moreover, VLoRP requires no extra computation and only minimal code changes, effectively providing a no-cost accuracy boost to LoRP. Finally, we provide a convergence analysis of VLoRP under either SGD or an Adam-based memory-efficient optimizer, and conduct extensive experiments to validate our findings, covering tasks such as Commonsense Reasoning, MMLU, and GSM8K.
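To illustrate the idea of projection granularity, here is a minimal sketch (not the authors' implementation): it assumes the gradient matrix is reshaped into chunks of a chosen width before a shared random low-rank projection is applied, so that finer chunks with a smaller rank can match the memory footprint of coarser, row-wise projection. The function names, shapes, and scaling are illustrative assumptions.

```python
# Hedged sketch of variable-granularity low-rank gradient projection.
# Assumption: each projection unit is a chunk of `chunk` consecutive gradient
# entries, compressed to `rank` dimensions by a shared Gaussian matrix.
import torch


def project_gradient(grad: torch.Tensor, chunk: int, rank: int, seed: int = 0) -> torch.Tensor:
    """Compress an (m x n) gradient by projecting chunks of width `chunk` to `rank` dims."""
    m, n = grad.shape
    assert (m * n) % chunk == 0, "chunk width must divide the number of gradient entries"
    units = grad.reshape(-1, chunk)                                # (m*n/chunk, chunk) projection units
    gen = torch.Generator().manual_seed(seed)
    proj = torch.randn(chunk, rank, generator=gen) / rank ** 0.5   # shared random projection
    return units @ proj                                            # compressed: (m*n/chunk, rank)


def project_back(compressed: torch.Tensor, shape, chunk: int, rank: int, seed: int = 0) -> torch.Tensor:
    """Map a compressed gradient back to the original parameter shape for the update."""
    gen = torch.Generator().manual_seed(seed)
    proj = torch.randn(chunk, rank, generator=gen) / rank ** 0.5
    return (compressed @ proj.T).reshape(shape)


if __name__ == "__main__":
    grad = torch.randn(4096, 4096)
    coarse = project_gradient(grad, chunk=4096, rank=64)  # row-wise projection (classic LoRP)
    fine = project_gradient(grad, chunk=1024, rank=16)    # finer granularity, same memory budget
    print(coarse.numel(), fine.numel())                   # both 262144 compressed entries
```

In this sketch, halving the chunk width while proportionally reducing the rank keeps the compressed size constant, which is the kind of granularity-rank trade-off the abstract describes under an identical memory budget.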
Supplementary Material: zip
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 7764