Beware of the Batch Size: Hyperparameter Bias in Evaluating LoRA

08 May 2026 (modified: 09 May 2026) · ICML 2026 Workshop CoLoRAI Submission · CC BY 4.0
Keywords: LoRA, parameter-efficient fine-tuning, batch size, hyperparameter bias
TL;DR: We show that vanilla LoRA remains a strong baseline once the batch size is properly tuned, and provide a practical guideline for tuning the batch size in LoRA fine-tuning via small-scale proxies.
Abstract: Low-rank adaptation (LoRA) is a standard method for fine-tuning large language models, yet its many variants report conflicting empirical gains, often on the same benchmarks. We show that these contradictions arise from a single overlooked factor: the batch size. When the batch size is properly tuned, vanilla LoRA often matches the performance of more complex variants. We further propose a proxy-based, cost-efficient strategy for batch size tuning, revealing the impact of rank, dataset size, and model capacity on the optimal batch size. Our findings elevate batch size from a minor implementation detail to a first-order design parameter, reconciling prior inconsistencies and enabling more reliable evaluations of LoRA variants.
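The sketch below is not the paper's method; it is a minimal illustration, under assumed settings, of what "proxy-based batch size tuning" for LoRA could look like: sweep candidate batch sizes on a small subset of the training data and carry the best-performing value over to the full run. The model (gpt2), dataset (wikitext-2), proxy size, LoRA rank, learning rate, and candidate batch sizes are all placeholders, not values from the paper.

```python
# Minimal sketch (assumptions labeled above): pick a batch size for LoRA
# fine-tuning by sweeping candidates on a small proxy subset of the data.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

MODEL = "gpt2"                        # illustrative small model
PROXY_SIZE = 2_000                    # illustrative proxy-subset size
CANDIDATE_BATCH_SIZES = [8, 16, 32, 64]

tokenizer = AutoTokenizer.from_pretrained(MODEL)
tokenizer.pad_token = tokenizer.eos_token

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

# Proxy: a small slice of the training set plus a held-out slice for evaluation.
raw = load_dataset("wikitext", "wikitext-2-raw-v1")
raw = raw.filter(lambda x: len(x["text"].strip()) > 0)
proxy_train = raw["train"].select(range(PROXY_SIZE)).map(tokenize, batched=True)
proxy_eval = raw["validation"].select(range(500)).map(tokenize, batched=True)
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

results = {}
for bs in CANDIDATE_BATCH_SIZES:
    # Fresh LoRA adapter for each candidate batch size.
    model = AutoModelForCausalLM.from_pretrained(MODEL)
    model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16,
                                             target_modules=["c_attn"],
                                             task_type="CAUSAL_LM"))
    args = TrainingArguments(output_dir=f"proxy_bs{bs}",
                             per_device_train_batch_size=bs,
                             num_train_epochs=1,
                             learning_rate=2e-4,
                             logging_steps=50,
                             report_to="none")
    trainer = Trainer(model=model, args=args, train_dataset=proxy_train,
                      eval_dataset=proxy_eval, data_collator=collator)
    trainer.train()
    results[bs] = trainer.evaluate()["eval_loss"]

best_bs = min(results, key=results.get)
print(f"Proxy eval loss per batch size: {results}")
print(f"Selected batch size for the full run: {best_bs}")
```

In this toy setup the proxy eval loss simply serves as the selection criterion; the paper's actual guideline for mapping proxy results (and for how rank, dataset size, and model capacity shift the optimum) is given in the full text.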
Submission Number: 114