The Role of Diverse Replay for Generalisation in Reinforcement Learning

Max Weltevrede; Matthijs T. J. Spaan; Wendelin Boehmer

Published: 20 Jul 2023, Last Modified: 08 Jun 2025 | EWRL16 | Readers: Everyone

Keywords: Reinforcement Learning, Generalisation, Replay Buffer, Exploration

TL;DR: We examine the role of diverse replay in improving generalisation performance in reinforcement learning.

Abstract: In reinforcement learning (RL), the exploration strategy and the replay buffer are key components of many algorithms. These components regulate what environment data is collected and trained on, and have been extensively studied in the RL literature. In this paper, we investigate their impact in the context of generalisation in multi-task RL. We investigate the hypothesis that collecting and training on more diverse data from the training environments will improve zero-shot generalisation to new tasks. We motivate mathematically and show empirically that generalisation to tasks that are "reachable" during training is improved by increasing the diversity of transitions in the replay buffer. Furthermore, we show empirically that the same strategy also improves generalisation to similar but "unreachable" tasks, which could be due to improved generalisation of the learned latent representations.

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/the-role-of-diverse-replay-for-generalisation/code)
