GRID: Scalable Task-Agnostic Prompt-Based Continual Learning for Language Models

ICLR 2026 Conference Submission 22185 Authors

20 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · License: CC BY 4.0
Keywords: Continual Learning, Prompt Tuning, Large Language Models, Backward Knowledge Transfer
TL;DR: GRID improves continual learning for LLMs by reducing prompt memory and exposing latent forgetting in task-agnostic settings
Abstract: Prompt-based continual learning (CL) provides a parameter-efficient approach for adapting large language models (LLMs) across task sequences. However, most existing methods rely on task-aware inference and maintain a growing set of task-specific prompts, which introduces two major challenges: (1) severe performance degradation on earlier tasks under task-agnostic inference, and (2) limited scalability due to prompt memory accumulation as task sequences grow. In this paper, we present GRID, a unified framework designed to address these challenges. GRID incorporates a decoding mechanism that enhances backward transfer by leveraging representative inputs, automatic task identification, and constrained decoding. Furthermore, it employs a gradient-guided prompt selection strategy to compress less informative prompts into a single aggregated representation, ensuring scalable and memory-efficient continual learning. Extensive experiments on long-sequence and negative transfer benchmarks show that GRID improves average accuracy and backward transfer, achieves competitive forward transfer, and substantially reduces prompt memory usage.
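To make the prompt-compression idea in the abstract concrete, below is a minimal sketch of gradient-guided prompt selection: prompts deemed less informative (here, by a per-prompt importance score such as an accumulated gradient magnitude) are merged into a single aggregated prompt, while the most informative ones are kept as-is. The function name `compress_prompts`, the `keep_k` parameter, and the use of a simple mean for aggregation are illustrative assumptions, not the authors' actual method.

```python
import torch


def compress_prompts(prompts, importance, keep_k=2):
    """Keep the keep_k most informative soft prompts and merge the rest.

    prompts:    list of [prompt_len, dim] tensors, one per task
    importance: 1-D tensor of per-prompt scores (assumed to come from
                gradient statistics; any scalar importance works here)
    Returns a shorter list: keep_k retained prompts plus one aggregate.
    """
    order = torch.argsort(importance, descending=True)
    kept = [prompts[i] for i in order[:keep_k]]
    rest = [prompts[i] for i in order[keep_k:]]
    if rest:
        # Compress the less informative prompts into one averaged representation.
        kept.append(torch.stack(rest).mean(dim=0))
    return kept


# Toy usage: 5 task prompts of shape [10, 768] with random importance scores.
prompts = [torch.randn(10, 768) for _ in range(5)]
importance = torch.rand(5)
compressed = compress_prompts(prompts, importance, keep_k=2)
print(len(compressed))  # 3 prompts stored instead of 5
```

Under this assumption, prompt memory grows with `keep_k + 1` rather than with the number of tasks, which is the scalability property the abstract claims.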
Supplementary Material: zip
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 22185