Keywords: lora, low-rank, optimization, memory-efficient, svd
TL;DR: OPLoRA is a memory-efficient optimizer for LoRA that generalizes recent preconditioned LoRA optimizers. It has an inner loop that refines the preconditioning, approaching the (locally optimal) truncated SVD of a full-matrix step.
Abstract: Low-Rank Adaptation (LoRA) fine-tunes large models by learning low-rank updates on top of frozen weights, dramatically reducing trainable parameters and memory. However, there is still a gap between full training with low-rank projections (SVDLoRA) and LoRA fine-tuning, indicating that LoRA steps can be further improved. In this study, we propose OPLoRA, a memory-efficient optimizer that closes this gap by casting LoRA optimization as an interpretable sub-problem and solving it efficiently with alternating least squares updates, where 1-2 alternating steps are empirically found to be sufficient to closely match truncated SVD without ever forming the full matrix (sketched below).
We also recover recently proposed preconditioning methods for LoRA as a special case.
OPLoRA supports momentum by maintaining a low-rank estimate of it with the same subroutine (LoRSum) that computes the step, also sketched below, at a memory budget of 3 times the number of LoRA parameters (i.e., the same as Adam).
We also propose an experimental scaled variant that uses the K-FAC metric.
Across a linear task, MNIST, CIFAR-100, and RoBERTa-base (MNLI), OPLoRA consistently approaches SVDLoRA's performance while using significantly less memory.
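To make the alternating least squares idea concrete, here is a minimal NumPy sketch (not the authors' implementation): it refines a rank-r factorization B @ A of an implicit full-matrix step M, accessing M only through the products M @ X and M.T @ Y. The function name als_lowrank, the ridge term, and the callback interface are illustrative assumptions.

```python
# Minimal ALS sketch (illustrative; not the paper's code).
import numpy as np

def als_lowrank(matmul_M, rmatmul_M, B, A, n_iters=2, ridge=1e-8):
    """Refine a rank-r factorization B @ A of an implicit matrix M by ALS.

    matmul_M(X):  returns M @ X    for thin X of shape (n, k)
    rmatmul_M(Y): returns M.T @ Y  for thin Y of shape (m, k)
    B: (m, r), A: (r, n) initial factors.
    """
    r = B.shape[1]
    I = np.eye(r)
    for _ in range(n_iters):
        # Fix A, solve min_B ||M - B A||_F  =>  B = (M A^T)(A A^T)^{-1}
        MAt = matmul_M(A.T)                              # (m, r); M never formed
        B = MAt @ np.linalg.inv(A @ A.T + ridge * I)
        # Fix B, solve min_A ||M - B A||_F  =>  A = (B^T B)^{-1}(B^T M)
        BtM = rmatmul_M(B).T                             # (r, n); M never formed
        A = np.linalg.inv(B.T @ B + ridge * I) @ BtM
    return B, A

if __name__ == "__main__":
    # Sanity check against truncated SVD on a small dense matrix.
    rng = np.random.default_rng(0)
    m, n, r = 64, 48, 4
    M = rng.normal(size=(m, r)) @ rng.normal(size=(r, n)) + 0.01 * rng.normal(size=(m, n))
    B0, A0 = rng.normal(size=(m, r)), rng.normal(size=(r, n))
    B, A = als_lowrank(lambda X: M @ X, lambda Y: M.T @ Y, B0, A0, n_iters=2)
    U, S, Vt = np.linalg.svd(M, full_matrices=False)
    svd_err = np.linalg.norm(M - U[:, :r] * S[:r] @ Vt[:r])
    als_err = np.linalg.norm(M - B @ A)
    print(f"truncated-SVD error: {svd_err:.4f}, ALS (2 steps) error: {als_err:.4f}")
```

In a LoRA setting, products of this form are available from quantities computed during backpropagation, so the full matrix is never materialized; the only extra cost is solving tiny r x r systems.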
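The momentum mechanism can be sketched in the same spirit. The snippet below is an assumption about the interface, not the paper's LoRSum definition: it keeps a rank-r factorization Bm @ Am of the momentum buffer and refreshes it with the als_lowrank helper from the previous sketch, treating beta * Bm_old @ Am_old + (1 - beta) * G as the implicit target.

```python
# Hedged sketch of low-rank momentum maintained with the same ALS routine.
# "als_lowrank" is the illustrative helper from the previous sketch; beta and
# this exact update rule are assumptions, not the paper's LoRSum definition.
def lowrank_momentum_step(matmul_G, rmatmul_G, Bm, Am, beta=0.9, n_iters=2):
    """Refresh a rank-r estimate Bm @ Am of the momentum buffer so that
    Bm @ Am ~= beta * (Bm_old @ Am_old) + (1 - beta) * G,
    touching G only through the callbacks matmul_G / rmatmul_G."""
    Bm_old, Am_old = Bm.copy(), Am.copy()
    # Both terms of the implicit target admit cheap products with thin
    # matrices, so the full momentum matrix is never formed.
    matmul_M = lambda X: beta * (Bm_old @ (Am_old @ X)) + (1 - beta) * matmul_G(X)
    rmatmul_M = lambda Y: beta * (Am_old.T @ (Bm_old.T @ Y)) + (1 - beta) * rmatmul_G(Y)
    return als_lowrank(matmul_M, rmatmul_M, Bm, Am, n_iters=n_iters)
```

Keeping only a few LoRA-sized factor pairs in this way keeps the optimizer state within a small multiple of the LoRA parameter count, in the spirit of the memory budget quoted in the abstract.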
Submission Number: 105