Density-Based Bonuses on Learned Representations for Reward-Free Exploration in Deep Reinforcement Learning

Omar Darwiche Domingues; Corentin Tallec; Remi Munos; Michal Valko

Published: 22 Jul 2021, Last Modified: 05 May 2023

Venue: URL 2021 Poster

Readers: Everyone

Keywords: reinforcement learning, exploration, reward-free, representation learning

Abstract: We study representation learning and exploration in reinforcement learning. We propose a framework for computing exploration bonuses from density estimates. The framework can be used with any representation learning method and allows the agent to explore without extrinsic rewards. In the special case of tabular Markov decision processes (MDPs), this approach mimics the behavior of theoretically sound algorithms. In continuous and partially observable MDPs, the same approach applies once a latent representation has been learned and a probability density has been estimated on it.
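As a rough illustration of the idea in the abstract, the sketch below turns a density estimate into an exploration bonus. The abstract does not give the bonus formula, so everything here is an assumption made for illustration: the bonus form 1/sqrt(N * p(s)), the clipping at one pseudo-count, the Gaussian kernel, and the function names `tabular_density_bonus` and `latent_density_bonus`. In the tabular case, N times the empirical density equals the visit count, so the bonus reduces to the familiar 1/sqrt(n(s)) count-based bonus. In the continuous case, the same form is applied to a kernel pseudo-count computed on latent vectors, which are assumed to come from some separately trained encoder.

```python
import numpy as np


def tabular_density_bonus(visited_states, n_states):
    """Assumed bonus 1/sqrt(N * p_hat(s)) with p_hat the empirical visit density.

    N * p_hat(s) is the visit count n(s); it is clipped at 1 so that
    unvisited states get a finite bonus (an illustrative choice).
    """
    visited_states = np.asarray(visited_states, dtype=int)
    total = visited_states.size
    counts = np.bincount(visited_states, minlength=n_states)
    density = counts / max(total, 1)  # empirical density over states
    pseudo_counts = total * density   # equals the visit counts n(s)
    return 1.0 / np.sqrt(np.maximum(pseudo_counts, 1.0))


def latent_density_bonus(queries, memory, bandwidth=1.0):
    """Same bonus form on latent vectors, using a Gaussian-kernel pseudo-count.

    `queries` has shape (q, d) and `memory` has shape (n, d); both are
    assumed to be outputs of a learned encoder (hypothetical here).
    """
    queries = np.asarray(queries, dtype=float)
    memory = np.asarray(memory, dtype=float)
    # Pairwise squared distances between each query and each stored latent.
    sq_dists = ((queries[:, None, :] - memory[None, :, :]) ** 2).sum(axis=-1)
    # Unnormalized kernel mass near each query, used as a soft visit count.
    pseudo_counts = np.exp(-sq_dists / (2.0 * bandwidth ** 2)).sum(axis=1)
    return 1.0 / np.sqrt(np.maximum(pseudo_counts, 1.0))
```

For example, `tabular_density_bonus([0, 0, 0, 0, 1], n_states=3)` gives a smaller bonus to state 0 (visited four times) than to states 1 and 2, and `latent_density_bonus` gives a smaller bonus to a query that sits on top of many stored latents than to one far from all of them.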
