Keywords: model merging, parameter-efficient fine-tuning, large language models
Abstract: Large language models (LLMs) are often fine-tuned for specific tasks using Low-Rank Adaptation (LoRA), an efficient method that adds small, task-specific modules called LoRA adapters to a pre-trained base model. However, a major challenge arises when merging multiple LoRA adapters trained on different data sources for a specific task: merging often causes task interference, which degrades the model's performance. While recent SVD-based LoRA merging methods have shown promise by decomposing adapters into orthogonal components and keeping only the most important ones, they share a key limitation: they process each adapter independently, overlooking potential interactions between different tasks. To address this, we propose CP merging, a novel LoRA merging method based on joint Canonical Polyadic (CP) decomposition. We first stack the LoRA adapters into a single third-order tensor. We then apply CP decomposition to this tensor to disentangle factors that are unique to each task from those that are shared across tasks. This joint factorization reduces cross-task interference without losing important information. Extensive experiments on NLP tasks demonstrate that CP merging outperforms existing SVD-based baselines.
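To make the stack-then-factorize idea concrete, here is a minimal sketch using the `tensorly` library. The tensor sizes, the `parafac` hyperparameters, and the averaging-based merging rule at the end are illustrative assumptions for this sketch, not the paper's exact procedure.

```python
# Minimal sketch of joint CP-based LoRA merging (hypothetical: the abstract
# does not specify the exact factor-selection or merging rule).
# Assumes each adapter k provides low-rank factors B_k (d_out x r) and
# A_k (r x d_in), and that `tensorly` is installed.
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

rng = np.random.default_rng(0)
d_out, d_in, r, K, R = 64, 32, 4, 3, 8  # illustrative sizes; R is the CP rank

# Dense per-task updates Delta W_k = B_k @ A_k, stacked into one 3rd-order tensor.
updates = [rng.standard_normal((d_out, r)) @ rng.standard_normal((r, d_in))
           for _ in range(K)]
T = tl.tensor(np.stack(updates))  # shape (K, d_out, d_in)

# Joint CP decomposition: all tasks share one set of output/input factors,
# while the task-mode factor weights each rank-1 component per task.
weights, (task_f, out_f, in_f) = parafac(T, rank=R, n_iter_max=200, init='random')

# Hypothetical merging rule for illustration: average the task-mode loadings
# so that components shared across tasks dominate the merged update.
merged_task = task_f.mean(axis=0, keepdims=True)  # shape (1, R)
merged = tl.cp_to_tensor((weights, [merged_task, out_f, in_f]))[0]
print(merged.shape)  # (d_out, d_in): a single merged LoRA update
```

In this sketch, the task-mode factor `task_f` is what a joint factorization adds over per-adapter SVD: components with near-uniform loadings across tasks can be read as shared structure, while components loaded on a single task capture task-specific structure.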
Supplementary Material: zip
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 7750