RepLoRA: Reparameterizing Low-rank Adaptation via the Perspective of Mixture of Experts

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
Abstract: Low-rank Adaptation (LoRA) has emerged as a powerful and efficient method for fine-tuning large-scale foundation models. Despite its popularity, the theoretical understanding of LoRA has remained limited. In this paper, we present a theoretical analysis of LoRA by examining its connection to Mixture of Experts (MoE) models. Under this framework, we show that a simple technique, reparameterizing the LoRA matrices, can notably accelerate the low-rank matrix estimation process. In particular, we prove that reparameterization can reduce the amount of data needed to achieve a desired estimation error from an exponential to a polynomial scale. Motivated by this insight, we propose **Rep**arameterized **Lo**w-**R**ank **A**daptation (RepLoRA), which incorporates a lightweight MLP to reparameterize the LoRA matrices. Extensive experiments across multiple domains demonstrate that RepLoRA consistently outperforms vanilla LoRA. With limited data, RepLoRA surpasses LoRA by a substantial margin of up to **40.0%** and achieves LoRA's performance using only **30.0%** of the training data, highlighting the theoretical and empirical robustness of our parameter-efficient fine-tuning (PEFT) method.
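For concreteness, the reparameterization described in the abstract can be written schematically as follows; the seed matrices $\tilde{A}, \tilde{B}$ and the MLPs $g_A, g_B$ below are illustrative placeholders based on this description, not the paper's exact notation or architecture.

$$\underbrace{\Delta W = B A}_{\text{vanilla LoRA}} \qquad\longrightarrow\qquad \underbrace{\Delta W = g_B(\tilde{B})\, g_A(\tilde{A})}_{\text{RepLoRA (sketch)}},$$

where $A, \tilde{A} \in \mathbb{R}^{r \times d_{\text{in}}}$ and $B, \tilde{B} \in \mathbb{R}^{d_{\text{out}} \times r}$ are trainable, and $g_A, g_B$ are lightweight non-linear MLPs; vanilla LoRA corresponds to taking $g_A, g_B$ to be identity maps.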
Lay Summary: Fine-tuning large AI models for specific tasks can be resource-intensive. Low-Rank Adaptation (LoRA) offers a solution by adjusting only a small portion of the model's parameters, making the process more efficient. However, the theoretical understanding of LoRA has been limited. In this paper, we develop a more data-efficient parameter-efficient fine-tuning (PEFT) method that builds upon LoRA. To do so, we first provide a theoretical analysis of LoRA's sample efficiency from the perspective of a Mixture of Experts (MoE). This perspective reveals that reparameterizing the LoRA matrices (essentially making the low-rank matrices the outputs of two small non-linear MLPs instead of optimizing them directly) can significantly accelerate the low-rank matrix estimation process. Specifically, we demonstrate that this reparameterization reduces the amount of data needed to achieve a certain level of estimation error from an exponential to a polynomial scale. Building on this insight, we introduce RepLoRA (Reparameterized Low-Rank Adaptation), which incorporates lightweight neural networks to reparameterize the LoRA matrices. Our extensive experiments across various domains show that RepLoRA consistently outperforms the standard LoRA approach. Notably, with limited data, RepLoRA achieves up to 40% better performance and matches LoRA's results using only 30% of the training data. Hence, this work not only provides a deeper theoretical understanding of LoRA but also offers a practical method to make AI fine-tuning more data-efficient and effective.
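To make the mechanism concrete, below is a minimal PyTorch sketch of a RepLoRA-style linear layer, assuming the low-rank factors are produced by small MLPs acting on trainable seed matrices. The class name `RepLoRALinear`, the MLP widths, the activation, and the initialization scheme are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn


class RepLoRALinear(nn.Module):
    """Hypothetical sketch of a RepLoRA-style linear layer.

    Vanilla LoRA learns the low-rank factors A and B directly. In this sketch,
    the factors are instead the outputs of lightweight MLPs applied to trainable
    seed matrices. Architectural details are assumptions for illustration only.
    """

    def __init__(self, base: nn.Linear, rank: int = 8, hidden: int = 16, alpha: float = 16.0):
        super().__init__()
        self.base = base  # frozen pretrained layer
        for p in self.base.parameters():
            p.requires_grad_(False)

        d_in, d_out = base.in_features, base.out_features
        self.scaling = alpha / rank

        # Trainable "seed" matrices with the same shapes as LoRA's A and B.
        self.seed_A = nn.Parameter(torch.randn(rank, d_in) * 0.01)
        self.seed_B = nn.Parameter(torch.randn(d_out, rank) * 0.01)

        # Lightweight MLPs that reparameterize the seeds into the LoRA factors.
        self.mlp_A = nn.Sequential(nn.Linear(d_in, hidden), nn.GELU(), nn.Linear(hidden, d_in))
        self.mlp_B = nn.Sequential(nn.Linear(rank, hidden), nn.GELU(), nn.Linear(hidden, rank))

        # Zero-init the last layer of mlp_B so the update starts at zero,
        # mirroring LoRA's convention of initializing B to zero.
        nn.init.zeros_(self.mlp_B[2].weight)
        nn.init.zeros_(self.mlp_B[2].bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        A = self.mlp_A(self.seed_A)  # (rank, d_in)
        B = self.mlp_B(self.seed_B)  # (d_out, rank)
        delta = x @ A.t() @ B.t()    # low-rank update B A applied to x
        return self.base(x) + self.scaling * delta


# Usage: wrap a pretrained projection and train only the seeds and MLPs.
layer = RepLoRALinear(nn.Linear(768, 768), rank=8)
y = layer(torch.randn(4, 768))
```

As in LoRA, only the seed matrices and the small reparameterizing MLPs receive gradients while the pretrained weight stays frozen; after training, the factors A and B can in principle be materialized once and merged into the base weight, so this sketch adds no extra inference-time cost.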
Primary Area: General Machine Learning->Transfer, Multitask and Meta-learning
Keywords: PEFT, LoRA, Reparameterization
Submission Number: 4304