Abstract: Fine-tuning large language models (LLMs) is computationally expensive, and Low-Rank Adaptation (LoRA) offers a cost-effective alternative by approximating weight updates with low-rank matrices. In multi-task learning (MTL) scenarios, recent works have introduced multi-head LoRA variants to capture task-specific knowledge across different tasks; however, we observe a high degree of similarity among the head matrices, which calls into question whether such structural complexity is necessary for multi-task generalization. In this work, we propose R-LoRA+, a simplified yet competitive multi-head LoRA. We highlight that simply increasing the rank of standard LoRA suffices to match or even surpass multi-adapter and multi-head methods, suggesting that structural diversification may not be necessary for multi-task generalization.
Furthermore, we find that explicitly encouraging shared representation learning leads to more effective adaptation under parameter-efficient fine-tuning. Experimental results confirm that focusing on shared knowledge across tasks improves multi-task generalization while preserving the deployment-friendly nature of LoRA.
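For readers unfamiliar with the mechanism the abstract refers to, below is a minimal PyTorch sketch of a standard LoRA layer: a frozen base weight plus a trainable low-rank update B·A whose capacity is controlled by the rank r, the knob the abstract argues is sufficient for multi-task generalization. This is an illustrative sketch of plain LoRA only, not the authors' R-LoRA+ implementation; the class and parameter names are hypothetical.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA layer: frozen base weight plus a trainable low-rank update B @ A."""

    def __init__(self, base: nn.Linear, rank: int = 16, alpha: float = 32.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # only the low-rank factors are trained
        in_f, out_f = base.in_features, base.out_features
        self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(out_f, rank))        # up-projection, zero-init
        self.scaling = alpha / rank

    def forward(self, x):
        # W x + (alpha/r) * B A x  -- the weight update stays rank-r
        return self.base(x) + self.scaling * (x @ self.A.T @ self.B.T)

# A single shared adapter; raising `rank` is the capacity knob discussed in the abstract.
layer = LoRALinear(nn.Linear(768, 768), rank=32)
out = layer(torch.randn(4, 768))
```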
Paper Type: Long
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: parameter-efficient-training, multi-task learning
Contribution Types: Approaches low compute settings-efficiency
Languages Studied: English
Submission Number: 1373