Abstract: Large language models (LLMs) are often fine-tuned for specific tasks using Low-Rank Adaptation (LoRA), an efficient method that adds small, task-specific modules called LoRA adapters to a pre-trained base model. However, a major challenge arises when merging multiple LoRA adapters, each trained on a different data source for its own task: the merged model often suffers from \textit{task interference}, i.e., redundancy and sign discrepancies among the parameters of the different task models, which cause information conflicts and performance loss. While SVD-based merging methods show promise by decomposing adapters into orthogonal components to reduce cross-task interference, they suffer from a critical limitation: SVD treats each LoRA adapter merely as an independent matrix, which prevents the identification of an optimal shared orthogonal basis and thus limits how effectively these approaches can reduce task interference. To address this, we propose a novel LoRA merging approach based on joint Canonical Polyadic (CP) decomposition, which we term CP Merging. We first stack the LoRA adapters into a single third-order tensor and then apply CP decomposition to this tensor to disentangle factors that are unique to each task from those that are shared across tasks. This joint factorization inherently reduces cross-task interference without sacrificing critical information. Extensive experiments validate this approach, demonstrating that CP Merging yields superior performance compared to existing SVD-based merging methods.
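For illustration only, the sketch below shows one way a joint CP factorization of stacked LoRA updates could be used for merging; the abstract does not specify the paper's actual procedure, so the rank R, the use of TensorLy's parafac, and the simple averaging of the task-mode factor are all assumptions.

# Minimal sketch of CP-based LoRA merging, assuming TensorLy's parafac and a
# simple average over the task-mode factor; this is NOT the paper's method,
# whose exact merging rule is not given in the abstract.
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

rng = np.random.default_rng(0)
K, d_out, d_in, r_lora, R = 3, 64, 32, 4, 8  # hypothetical sizes

# Simulate K task-specific LoRA updates: Delta W_k = B_k @ A_k.
updates = [
    rng.standard_normal((d_out, r_lora)) @ rng.standard_normal((r_lora, d_in))
    for _ in range(K)
]

# Stack the adapters into a single third-order tensor of shape (K, d_out, d_in).
T = tl.tensor(np.stack(updates, axis=0))

# Joint CP decomposition: T ~= sum_r w_r * u_r (task) o v_r (output) o z_r (input).
weights, (task_f, out_f, in_f) = parafac(
    T, rank=R, normalize_factors=True, init="random", random_state=0
)

# One possible merge: average the task-mode loadings so that components shared
# across tasks dominate, then reconstruct a single merged update matrix.
merged_update = out_f @ np.diag(weights * task_f.mean(axis=0)) @ in_f.T
print(merged_update.shape)  # (d_out, d_in)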
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Guillaume_Obozinski3
Submission Number: 6523