Abstract: Memory is a critical component in replay-based continual learning (CL). Prior research has largely treated CL memory as a monolithic store of past data, focusing on how to select and store representative past examples. However, this perspective overlooks the higher-level memory architecture that governs the interaction between old and new data. In this work, we identify and characterize a dual-memory system that is inherently present in both online and offline CL settings. This system comprises a short-term memory, which temporarily buffers recent data for immediate model updates, and a long-term memory, which maintains a carefully curated subset of past experiences for future replay and consolidation. We propose the \textit{memory capacity ratio} (MCR), the ratio of short-term to long-term memory capacity, to characterize online and offline CL. Based on this framework, we systematically investigate how the MCR influences generalization, stability, and plasticity. Across diverse CL settings (class-incremental, task-incremental, and domain-incremental) and multiple data modalities (e.g., image and text classification), we observe that a smaller MCR, characteristic of online CL, can yield comparable or even superior performance to a larger one, characteristic of offline CL, when both are evaluated under equivalent computational and data storage budgets. This advantage holds consistently across several state-of-the-art replay strategies, such as ER, DER, and SCR. Theoretical analysis further reveals that a reduced MCR yields a better trade-off between stability and plasticity and lowers a bound on the generalization error when learning from non-stationary data streams with limited memory. These findings offer new insights into the role of memory allocation in continual learning and underscore the underexplored potential of online CL approaches.
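To make the dual-memory framing concrete, the following Python sketch shows one way a replay loop could be parameterized by the MCR. It is a minimal illustration, not the paper's implementation: the names `ReservoirMemory`, `continual_train`, `update_fn`, and the reservoir-sampling choice for the long-term memory are illustrative assumptions.

```python
# Minimal sketch of a dual-memory replay loop parameterized by the MCR.
# Assumptions (not from the paper): ReservoirMemory / continual_train / update_fn
# are hypothetical names; ER-style replay with reservoir sampling is used as an example.
import random


class ReservoirMemory:
    """Long-term memory: fixed-capacity reservoir sample of past examples."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []
        self.seen = 0

    def add(self, example):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = example

    def sample(self, k):
        return random.sample(self.data, min(k, len(self.data)))


def continual_train(stream, model, update_fn, long_term_capacity, mcr, replay_size=32):
    """Replay-based CL loop where MCR = short-term capacity / long-term capacity.

    A small MCR (short-term buffer of a few batches) corresponds to online CL;
    a large MCR (buffering an entire task before updating) corresponds to offline CL.
    """
    short_term_capacity = max(1, int(mcr * long_term_capacity))
    long_term = ReservoirMemory(long_term_capacity)
    short_term = []

    for example in stream:
        short_term.append(example)
        if len(short_term) >= short_term_capacity:
            # Mix buffered recent data with replayed old data, then update the model.
            replay = long_term.sample(replay_size)
            update_fn(model, short_term + replay)
            # Consolidate the recent data into long-term memory for future replay.
            for ex in short_term:
                long_term.add(ex)
            short_term.clear()
    return model
```

Under this sketch, comparing a small and a large `mcr` with the same `long_term_capacity` and the same total number of gradient updates corresponds to the equal-compute, equal-storage comparison described in the abstract.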
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Robert_Legenstein1
Submission Number: 5523