HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning

Chunlin Tian; Zhan Shi; Zhijiang Guo; Li Li; Cheng-zhong Xu

Published: 25 Sept 2024, Last Modified: 06 Nov 2024, Venue: NeurIPS 2024 oral, License: CC BY-NC 4.0

Keywords: Large Language Models, Efficient Fine-Tuning, Asymmetric Structure

Abstract: Parameter-Efficient Fine-Tuning (PEFT) techniques such as LoRA have made adapting Large Language Models (LLMs) to new tasks more efficient. However, these methods often underperform full fine-tuning, particularly on complex datasets, and the gap becomes even more pronounced in complex domains, highlighting the need for PEFT approaches that achieve better performance. Through a series of experiments, we uncover two critical insights that shed light on the training and parameter inefficiency of LoRA. Building on these insights, we develop HydraLoRA, a LoRA framework with an asymmetric structure that eliminates the need for domain expertise. Our experiments demonstrate that HydraLoRA outperforms other PEFT approaches, even those that rely on domain knowledge during the training and inference phases. Code is available at https://github.com/Clin0212/HydraLoRA.

Primary Area: Generative models

Submission Number: 1402
