Abstract: In reinforcement learning (RL), experience replay-based sampling techniques are crucial for promoting convergence by breaking spurious correlations in the training data. However, widely used methods such as uniform experience replay (UER) and prioritized experience replay (PER) have been shown to suffer from sub-optimal convergence and high seed sensitivity, respectively. To address these issues, we propose a novel approach called Introspective Experience Replay (IER) that selectively samples batches of data points that precede surprising events. Our method is inspired by the reverse experience replay (RER) technique, which has been shown to reduce bias in the output of Q-learning-type algorithms with linear function approximation. However, RER is not always practically reliable when used with neural function approximation. Through empirical evaluations, we demonstrate that IER with neural function approximation yields reliable and superior performance compared to UER, PER, and hindsight experience replay (HER) across most tasks.
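The abstract summarizes the IER sampling rule only at a high level ("sample batches of data points that precede surprising events"). The sketch below illustrates one plausible reading of that rule; the choice of absolute TD error as the surprise score, the top-k anchor selection, and all function and parameter names are assumptions for illustration, not the paper's actual implementation.

```python
# Minimal sketch of the IER idea from the abstract: pick "surprising"
# transitions in the replay buffer and return the consecutive transitions
# that immediately PRECEDE each of them. Surprise measure (|TD error|) and
# top-k anchor selection are assumptions, not taken from the paper.
import numpy as np

def ier_sample(buffer, surprise, num_batches, batch_size):
    """Return `num_batches` batches, each ending at a surprising index.

    buffer   : list of transitions (s, a, r, s_next, done), oldest first
    surprise : np.ndarray of per-transition surprise scores (e.g., |TD error|)
    """
    n = len(buffer)
    # Only indices with enough history before them can anchor a full batch.
    valid = np.arange(batch_size - 1, n)
    # Assumption: take the top-k most surprising valid indices as anchors.
    anchors = valid[np.argsort(surprise[valid])[-num_batches:]]
    batches = []
    for t in anchors:
        # Collect the batch_size transitions ending at the surprising index t,
        # i.e., the points immediately before (and including) the surprise.
        batches.append([buffer[i] for i in range(t - batch_size + 1, t + 1)])
    return batches
```

A stochastic prioritization over anchors (as in PER) would also be consistent with the abstract's description; the deterministic top-k choice here is just the simplest variant to state.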
Submission Length: Regular submission (no more than 12 pages of main content)
Code: https://github.com/google-research/look-back-when-surprised/tree/main
Supplementary Material: zip
Assigned Action Editor: ~Amir-massoud_Farahmand1
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Number: 1550