Is Our Benchmark Enough? An Analysis of Continual Learning for MLLMs

Published: 23 May 2026, Last Modified: 23 May 2026
Venue: CATS@ICML26 Oral
License: CC BY 4.0
Keywords: continual learning, continual domain adaptation, multimodal large language models
Abstract: Continual adaptation is essential for multimodal large language models (MLLMs) deployed across evolving domains, but the state-of-the-art MR-LoRA method relies heavily on the assumption that an MLLM-based router is necessary to process complex multimodal inputs. This paper revisits that assumption on the MLLM-CL benchmark and makes two claims. **First**, routing does not require an MLLM: a simple training-free, replay-free prototypical routing method (RePro) uses frozen pretrained features and task prototypes to match the MLLM-based router of MR-LoRA at far lower computational cost. **Second**, shared experts do not improve continual learning for MLLMs, despite their theoretical appeal. We show that these findings arise from two structural limitations of MLLM-CL: (1) its tasks are **highly separable** in representation space, and (2) its fixed task order makes conclusions **sensitive to a single curriculum** rather than robust across diverse continual-learning trajectories. As a result, the benchmark primarily rewards learning in isolation rather than genuine continual transfer. This motivates a new design for future benchmarks of continual MLLM learning, with overlapping task manifolds, multiple task orders, fine-grained domain shifts, and evaluation protocols that reward forward transfer as well as retention.
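As an illustration of the routing idea in the abstract, the following is a minimal sketch of training-free, prototype-based routing. It is not the authors' RePro implementation: the abstract does not say how prototypes are built or compared, so the mean-of-normalized-features prototype, the cosine-similarity rule, and the names `l2_normalize`, `build_prototypes`, and `route` are all assumptions made for illustration. The feature vectors stand in for outputs of some frozen pretrained encoder.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def build_prototypes(features_by_task):
    """Assumed prototype rule: average the normalized features of each task.

    features_by_task maps a task name to a list of feature vectors that
    would come from a frozen pretrained encoder (hypothetical input here).
    No parameters are trained and no past samples are stored.
    """
    prototypes = {}
    for task, feats in features_by_task.items():
        normed = [l2_normalize(f) for f in feats]
        dim = len(normed[0])
        mean = [sum(f[i] for f in normed) / len(normed) for i in range(dim)]
        prototypes[task] = l2_normalize(mean)
    return prototypes


def route(feature, prototypes):
    """Assumed routing rule: pick the task with the most similar prototype."""
    query = l2_normalize(feature)

    def cosine(task):
        return sum(a * b for a, b in zip(query, prototypes[task]))

    return max(prototypes, key=cosine)


# Toy usage with made-up 2-D features for two hypothetical tasks.
toy_features = {
    "task_a": [[1.0, 0.1], [0.9, 0.0]],
    "task_b": [[0.0, 1.0], [0.1, 0.8]],
}
toy_prototypes = build_prototypes(toy_features)
selected_task = route([0.8, 0.2], toy_prototypes)
```

In a setup like the one the abstract describes, the selected task would then determine which task-specific expert handles the input. When tasks are highly separable in representation space, a nearest-prototype rule of this kind has little room to fail, which is consistent with the paper's argument about the benchmark.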
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 34