Bi-LoRA: Efficient Sharpness-Aware Minimization for Fine-Tuning Large-Scale Models

ICLR 2026 Conference Submission 11559 Authors

18 Sept 2025 (modified: 27 Nov 2025) · CC BY 4.0
Keywords: low-rank adaptation, efficient training, generalization
TL;DR: We propose Bi-LoRA that introduces an auxiliary LoRA module to model SAM’s adversarial weight perturbations, enhancing generalization while avoiding doubled computational costs.
Abstract: Low-Rank Adaptation (LoRA) enables parameter-efficient fine-tuning of large pre-trained models, yet LoRA can face generalization challenges. One promising way to improve generalization is Sharpness-Aware Minimization (SAM), which has proven effective in small-scale training scenarios. In this paper, we propose **Bi**-directional **Lo**w-**R**ank **A**daptation (Bi-LoRA), which introduces an auxiliary adversarial LoRA module. This design explicitly decouples sharpness optimization, handled by the auxiliary module, from task adaptation, performed by the primary module. Such a separation yields two key benefits. First, it transforms the sequential computation of the primary LoRA update and the adversarial perturbation into a parallel one, which roughly halves the training time and removes the main obstacle to applying SAM with LoRA. Second, the auxiliary module's perturbations do not collapse into the restricted optimization subspace of the primary module, enabling broader sharpness exploration and flatter minima. Bi-LoRA thus achieves both efficiency and effectiveness within a single framework, as verified by extensive experiments across diverse architectures and tasks.
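The decoupling described in the abstract can be sketched as a toy implementation: a frozen base weight carries two low-rank branches, where the primary branch takes gradient-descent steps on the task loss and the auxiliary branch takes gradient-ascent steps to play the role of SAM's adversarial perturbation, so both updates use the same backward pass. All class and variable names, learning rates, and the plain NumPy least-squares setup are illustrative assumptions; the paper's actual algorithm (e.g., any SAM-style normalization of the perturbation radius) is not reproduced here.

```python
import numpy as np

class BiLoRALinear:
    """Toy Bi-LoRA linear layer (illustrative sketch, not the paper's code).

    Effective weight: W + B_p @ A_p + B_a @ A_a, where
      - primary branch  (A_p, B_p): task adaptation, updated by gradient DESCENT
      - auxiliary branch (A_a, B_a): adversarial perturbation, updated by ASCENT
    Both branches share one gradient w.r.t. the effective weight, so the
    adversarial step runs in parallel with the task step instead of SAM's
    extra sequential forward/backward pass.
    """

    def __init__(self, d_in, d_out, rank, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_out, d_in)) * 0.1  # frozen base weight
        # Standard LoRA init: A random, B zero, so branches start as identity.
        self.A_p = rng.standard_normal((rank, d_in)) * 0.1
        self.B_p = np.zeros((d_out, rank))
        self.A_a = rng.standard_normal((rank, d_in)) * 0.1
        self.B_a = np.zeros((d_out, rank))

    def forward(self, x):
        # y = (W + B_p A_p + B_a A_a) x
        return (self.W + self.B_p @ self.A_p + self.B_a @ self.A_a) @ x

    def train_step(self, x, t, lr=0.05, lr_adv=0.01):
        """One step on the squared loss 0.5 * ||y - t||^2 for a single sample."""
        y = self.forward(x)
        g = np.outer(y - t, x)  # gradient of the loss w.r.t. the effective weight
        # Primary branch: descent on the task loss.
        self.B_p -= lr * (g @ self.A_p.T)
        self.A_p -= lr * (self.B_p.T @ g)
        # Auxiliary branch: ascent, modeling the adversarial perturbation.
        self.B_a += lr_adv * (g @ self.A_a.T)
        self.A_a += lr_adv * (self.B_a.T @ g)
        return 0.5 * float((y - t) @ (y - t))
```

With a small ascent rate the primary branch still drives the loss down while the auxiliary branch keeps probing the loss surface; at deployment only the primary branch would be merged into the base weight.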
Supplementary Material: zip
Primary Area: optimization
Submission Number: 11559