Uncertainty-Based Experience Replay for Task-Agnostic Continual Reinforcement Learning

TMLR Paper 3378 Authors

23 Sept 2024 (modified: 04 Oct 2024) · Under review for TMLR · CC BY 4.0
Abstract: Model-based reinforcement learning uses a learned dynamics model to imagine actions and select those with the best expected outcomes. An experience replay buffer collects the outcomes of all actions executed in the environment; this buffer is then used to iteratively train the dynamics model. However, as the complexity and scale of tasks increase, training times and memory requirements can grow drastically without necessarily retaining useful experiences. Continual learning proposes a more realistic scenario in which tasks are learned in sequence, and the replay buffer can help mitigate catastrophic forgetting. However, it is not realistic to expect the buffer to grow indefinitely as the sequence advances. Furthermore, storing every single experience executed in the environment does not necessarily yield a more accurate model. We argue that the replay buffer should be kept at the minimal size necessary to retain relevant experiences that cover both common and rare states. Therefore, we propose an uncertainty-based replay buffer filtering strategy to enable an effective implementation of continual learning agents using model-based reinforcement learning. We show that the combination of the proposed strategies leads to reduced training times, a smaller replay buffer, and less catastrophic forgetting, all while maintaining performance.
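To make the filtering idea concrete, the following is a minimal sketch of an uncertainty-filtered replay buffer: when the buffer is full, the lowest-uncertainty transition is evicted so that rare or poorly modelled states are preferentially retained. The class and function names (UncertaintyFilteredBuffer, ensemble_disagreement) are illustrative only, and the use of ensemble disagreement as the uncertainty measure is an assumption made here, not necessarily the paper's exact method.

```python
import numpy as np


class UncertaintyFilteredBuffer:
    """Fixed-capacity replay buffer that evicts the lowest-uncertainty
    transition when full, keeping experiences the model is least sure about."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.transitions = []  # (state, action, reward, next_state) tuples
        self.scores = []       # per-transition uncertainty scores

    def add(self, transition, uncertainty):
        if len(self.transitions) < self.capacity:
            self.transitions.append(transition)
            self.scores.append(uncertainty)
            return
        # Buffer full: replace the most redundant (lowest-uncertainty) entry,
        # but only if the incoming transition is more informative.
        i = int(np.argmin(self.scores))
        if uncertainty > self.scores[i]:
            self.transitions[i] = transition
            self.scores[i] = uncertainty

    def sample(self, batch_size, rng=np.random):
        idx = rng.choice(len(self.transitions), size=batch_size, replace=True)
        return [self.transitions[i] for i in idx]


def ensemble_disagreement(ensemble, state, action):
    """One common uncertainty proxy: variance of next-state predictions
    across an ensemble of learned dynamics models (an assumption here)."""
    preds = np.stack([model(state, action) for model in ensemble])
    return float(preds.var(axis=0).mean())
```

In this sketch, each dynamics model in `ensemble` is any callable mapping (state, action) to a predicted next state; high disagreement marks transitions the current model explains poorly, which is what makes them worth keeping under a fixed memory budget.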
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Erin_J_Talvitie1
Submission Number: 3378