Keywords: Continual Learning, Subspace Regularization
TL;DR: We propose ELLA, a replay-free continual learning framework for LLMs that mitigates forgetting by regularizing low-rank adapters through subspace decorrelation, achieving superior retention, transfer, and scalability across 3 benchmarks.
Abstract: Continual Learning (CL) is a vital requirement for deploying large language models (LLMs) in today's dynamic world. Existing approaches seek to acquire task-specific knowledge via parameter-efficient fine-tuning (PEFT) with reduced compute overhead. However, sequential fine-tuning often sacrifices performance retention and forward transfer, especially under replay-free constraints. We introduce ELLA, a novel CL framework that regularizes low-rank adapter updates via cross-task subspace decorrelation. By learning a compact adapter per task and penalizing overlap between the representational subspaces of past and current adapter activations, ELLA encourages task specialization while preserving prior knowledge, without storing past data. Across $3$ benchmarks, ELLA outperforms prior CL methods on both accuracy and forgetting metrics, providing a scalable solution for lifelong LLM learning.
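The abstract describes penalizing overlap between the representational subspaces of past and current adapter activations. The following is a minimal sketch of one plausible form of such a penalty, not ELLA's actual implementation: it extracts the top-k principal directions of past-adapter and current-adapter activations via SVD and penalizes the Frobenius norm of their overlap. All names (`top_k_basis`, `decorrelation_penalty`) and the choice of overlap measure are illustrative assumptions.

```python
import torch


def top_k_basis(acts: torch.Tensor, k: int) -> torch.Tensor:
    """Return an orthonormal basis (d x k) spanning the top-k principal
    directions of a batch of activations with shape (batch, d)."""
    acts = acts - acts.mean(dim=0, keepdim=True)      # center the activations
    _, _, vh = torch.linalg.svd(acts, full_matrices=False)
    return vh[:k].T                                   # columns are orthonormal


def decorrelation_penalty(past_acts: torch.Tensor,
                          curr_acts: torch.Tensor,
                          k: int = 8) -> torch.Tensor:
    """Penalize subspace overlap ||U_past^T U_curr||_F^2: it is 0 when the
    two subspaces are orthogonal and k when they coincide."""
    u_past = top_k_basis(past_acts, k).detach()       # past subspace is frozen
    u_curr = top_k_basis(curr_acts, k)
    return (u_past.T @ u_curr).pow(2).sum()


if __name__ == "__main__":
    # Toy usage: add the penalty to the current task's loss.
    past = torch.randn(64, 256)                       # activations from frozen past adapters
    curr = torch.randn(64, 256, requires_grad=True)   # current adapter activations
    loss = decorrelation_penalty(past, curr)
    loss.backward()
    print(float(loss))
```

In practice the penalty would be weighted and summed with the task loss; detaching the past-task basis keeps gradients flowing only into the current adapter, which matches the replay-free, knowledge-preserving goal stated in the abstract.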
Serve As Reviewer: ~Radhika_Bhargava1
Submission Number: 5