Keywords: Continual Learning, Subspace Regularization
TL;DR: We propose ELLA, a replay-free continual learning framework for LLMs that mitigates forgetting by regularizing low-rank adapters through subspace decorrelation, achieving superior retention, transfer, and scalability across 3 benchmarks.
Abstract: Continual Learning (CL) is a vital requirement for deploying large language models (LLMs) in today's dynamic world. Existing approaches seek to acquire task-specific knowledge via parameter-efficient fine-tuning (PEFT) with reduced compute overhead. However, sequential fine-tuning often sacrifices performance retention and forward transfer, especially under replay-free constraints. We introduce ELLA, a novel CL framework that regularizes low-rank adapter updates via cross-task subspace decorrelation. By learning a compact adapter per task and penalizing overlap between the representational subspaces of past and current adapter activations, ELLA encourages task specialization while preserving prior knowledge, without storing past data. Across $3$ benchmarks, ELLA outperforms prior CL methods on both accuracy and forgetting metrics, providing a scalable solution for lifelong LLM learning.
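The abstract describes penalizing overlap between the representational subspaces of past and current adapter activations. The following is a minimal sketch of one plausible form of such a penalty, not ELLA's actual implementation: it extracts the top-k principal directions of past-adapter and current-adapter activations via SVD and penalizes the Frobenius norm of their overlap. All names (`top_k_basis`, `decorrelation_penalty`) and the choice of overlap measure are illustrative assumptions.

```python
import torch


def top_k_basis(acts: torch.Tensor, k: int) -> torch.Tensor:
    """Return an orthonormal basis (d x k) spanning the top-k principal
    directions of a batch of activations with shape (batch, d)."""
    acts = acts - acts.mean(dim=0, keepdim=True)      # center the activations
    _, _, vh = torch.linalg.svd(acts, full_matrices=False)
    return vh[:k].T                                   # columns are orthonormal


def decorrelation_penalty(past_acts: torch.Tensor,
                          curr_acts: torch.Tensor,
                          k: int = 8) -> torch.Tensor:
    """Penalize subspace overlap ||U_past^T U_curr||_F^2: it is 0 when the
    two subspaces are orthogonal and k when they coincide."""
    u_past = top_k_basis(past_acts, k).detach()       # past subspace is frozen
    u_curr = top_k_basis(curr_acts, k)
    return (u_past.T @ u_curr).pow(2).sum()


if __name__ == "__main__":
    # Toy usage: add the penalty to the current task's loss.
    past = torch.randn(64, 256)                       # activations from frozen past adapters
    curr = torch.randn(64, 256, requires_grad=True)   # current adapter activations
    loss = decorrelation_penalty(past, curr)
    loss.backward()
    print(float(loss))
```

In practice the penalty would be weighted and summed with the task loss; detaching the past-task basis keeps gradients flowing only into the current adapter, which matches the replay-free, knowledge-preserving goal stated in the abstract.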
Serve As Reviewer: ~Radhika_Bhargava1
Submission Number: 5