Keywords: Continual Learning, Large Language Models, Parameter-Efficient Fine-Tuning, Modular Adaptation, Non-parametric Routing
TL;DR: CaLLM solves continual LLM adaptation through fixed-cost modularity. By routing tasks to a bounded pool of lightweight adapters, it prevents catastrophic forgetting while keeping memory and compute costs strictly constant.
Abstract: Re-training Large Language Models (LLMs) each time a new task or domain emerges is neither practical nor cost-effective, making continual adaptation a central challenge. We introduce CaLLM, which combines metric-based meta-learning with modular parameter isolation via lightweight adapters with controllable plasticity. Each input is projected into a task-agnostic embedding space and routed to its nearest-prototype adapter by a non-parametric, training-free router that is agnostic to task and regime. When the signal is novel, a new adapter is spawned; once capacity is full, the least-used adapter is repurposed instead, yielding a bounded-cost form of intentional knowledge removal. As a result, CaLLM scales to many tasks: training FLOPs, memory use, and routing overhead stay strictly constant regardless of the adapter pool size. We evaluate CaLLM on a cross-domain shift and a long-horizon dialogue stream, under both batched and online scenarios, where CaLLM outperforms baselines overall, with the largest gains on the hardest tasks.
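The routing policy described in the abstract (nearest prototype, spawn on novelty, repurpose the least-used adapter at capacity) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `PrototypeRouter`, the Euclidean distance, the fixed `novelty_threshold`, and the per-slot usage counter are all assumptions made for illustration, since the abstract does not specify the metric, the novelty criterion, or how usage is tracked.

```python
import numpy as np


class PrototypeRouter:
    """Hypothetical nearest-prototype router over a bounded adapter pool."""

    def __init__(self, capacity, novelty_threshold):
        self.capacity = capacity                    # maximum number of adapters
        self.novelty_threshold = novelty_threshold  # assumed distance cutoff
        self.prototypes = []                        # one embedding per adapter slot
        self.usage = []                             # routing count per slot

    def route(self, embedding):
        """Return (adapter_index, action) for one input embedding."""
        z = np.asarray(embedding, dtype=float)

        # Reuse: the nearest prototype is close enough (Euclidean, an assumption).
        if self.prototypes:
            dists = [float(np.linalg.norm(z - p)) for p in self.prototypes]
            nearest = int(np.argmin(dists))
            if dists[nearest] <= self.novelty_threshold:
                self.usage[nearest] += 1
                return nearest, "reuse"

        # Spawn: the input is novel and the pool still has room.
        if len(self.prototypes) < self.capacity:
            self.prototypes.append(z)
            self.usage.append(1)
            return len(self.prototypes) - 1, "spawn"

        # Repurpose: the pool is full, so overwrite the least-used slot.
        victim = int(np.argmin(self.usage))
        self.prototypes[victim] = z
        self.usage[victim] = 1
        return victim, "repurpose"
```

In this sketch the pool never grows beyond `capacity`, which is the sense in which memory stays bounded; the router itself has no trainable parameters, only stored prototypes and counters.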
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 61