Disentangling Multimodal Knowledge Preservation and Editing via Low-rank Adaptation

11 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Knowledge editing, Multimodal large language models, Parameter-efficient fine-tuning
TL;DR: This work disentangles knowledge preservation and editing in parameter-efficient fine-tuning for multimodal large language models.
Abstract: Knowledge editing facilitates precise and targeted updates in Large Language Models (LLMs) and Large Multimodal Models (LMMs) without the need for full retraining. Although existing editing methods achieve remarkable performance in the textual modality, they still struggle to simultaneously preserve pre-trained knowledge and generalize new knowledge in intricate multimodal contexts. To address this challenge, we propose ELoRA, a novel solution that disentangles these conflicting editing objectives. Specifically, ELoRA decomposes the standard Low-Rank Adaptation (LoRA) update into two complementary subspaces: (1) a null space aligned with preserved knowledge, constructed via multimodal initialization to maintain the model's general capabilities, and (2) a knowledge space extracted from the model's internal responses to multimodal perturbations, capturing the invariant semantics of the updates. Extensive experiments on various LMMs, including LLaVA-v1.5-7B, Qwen2.5-VL-7B, and Phi-4-multimodal, show that ELoRA outperforms most LoRA-based methods by an average of 14.2% accuracy across three metrics (reliability, generality, and locality) under rigorous LLM-as-a-Judge evaluation, demonstrating that ELoRA achieves superior real-world editing quality.
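
The abstract gives no implementation details, so the following is only a minimal sketch of the null-space-constrained LoRA idea it describes, assuming a standard PyTorch setup; the names `null_space_projection` and `ProjectedLoRALinear`, the SVD-based projector, and all hyperparameters are hypothetical illustrations rather than ELoRA's actual construction.

```python
import torch

def null_space_projection(K: torch.Tensor, tol: float = 1e-5) -> torch.Tensor:
    """Projector onto the null space of preserved-knowledge activations.

    K: (n_samples, d_in) matrix of hidden states collected on inputs the
    edit must not disturb. Applying the returned projector P on the input
    side makes the update vanish on the row space of K, so the edited
    layer leaves those activations (approximately) unchanged.
    """
    # Right singular vectors with near-zero singular values span null(K).
    _, S, Vh = torch.linalg.svd(K, full_matrices=True)
    rank = int((S > tol * S.max()).sum())
    V_null = Vh[rank:].T                      # (d_in, d_in - rank)
    return V_null @ V_null.T                  # (d_in, d_in) projector

class ProjectedLoRALinear(torch.nn.Module):
    """Frozen linear layer plus a LoRA update confined to a null space."""

    def __init__(self, base: torch.nn.Linear, P: torch.Tensor, r: int = 8):
        super().__init__()
        self.base = base.requires_grad_(False)   # preserved pre-trained weights
        d_out, d_in = base.weight.shape
        self.A = torch.nn.Parameter(torch.randn(r, d_in) * 0.01)
        self.B = torch.nn.Parameter(torch.zeros(d_out, r))
        self.register_buffer("P", P)             # fixed null-space projector

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Route the low-rank path through P so the update only acts on
        # directions orthogonal to the preserved-knowledge subspace.
        return self.base(x) + (x @ self.P) @ self.A.T @ self.B.T
```

Routing the LoRA path through a fixed projector is one standard way to realize a "null space aligned with preserved knowledge": for any activation in the row space of K, the low-rank branch contributes (near) zero, so locality is enforced by construction while the trainable factors A and B remain free to fit the edit.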
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 3972