Track: Full track
Keywords: reinforcement learning, intrinsic motivation, exploration
TL;DR: This paper proposes using a learned prior in a Variational Autoencoder to estimate novelty via KL divergence, improving exploration efficiency and reward accumulation in sparse-reward environments.
Abstract: Efficient exploration in reinforcement learning is challenging, especially in sparse-reward environments. Intrinsic motivation, such as rewarding state novelty, can enhance exploration. We propose an intrinsic motivation approach, called Variational Learned Priors, that uses variational state encoding to estimate novelty via the Kullback-Leibler divergence between the posterior distribution and a learned prior of a Variational Autoencoder. We assess this intrinsic reward with four different learned priors. Our results show that this method improves exploration efficiency and accelerates extrinsic reward accumulation across various domains.
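To make the mechanism described in the abstract concrete, the following is a minimal sketch (not the authors' implementation) of a KL-based novelty reward, assuming a diagonal-Gaussian VAE posterior q(z|s) and a learned diagonal-Gaussian prior; the module, layer sizes, and names such as `encoder`, `prior_mu`, and `prior_log_var` are illustrative assumptions, and the paper evaluates four different learned priors rather than this single parameterization.

```python
# Illustrative sketch only: intrinsic reward as the KL divergence between a
# VAE's diagonal-Gaussian posterior q(z|s) and a learned diagonal-Gaussian prior p(z).
import torch
import torch.nn as nn
from torch.distributions import Normal, kl_divergence


class NoveltyModule(nn.Module):
    def __init__(self, state_dim: int, latent_dim: int):
        super().__init__()
        # Hypothetical encoder producing posterior parameters q(z|s) = N(mu, sigma^2).
        self.encoder = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, 2 * latent_dim),
        )
        # Learned prior parameters p(z) = N(mu_p, sigma_p^2); one possible
        # learned-prior parameterization among several.
        self.prior_mu = nn.Parameter(torch.zeros(latent_dim))
        self.prior_log_var = nn.Parameter(torch.zeros(latent_dim))

    def intrinsic_reward(self, state: torch.Tensor) -> torch.Tensor:
        mu, log_var = self.encoder(state).chunk(2, dim=-1)
        posterior = Normal(mu, (0.5 * log_var).exp())
        prior = Normal(self.prior_mu, (0.5 * self.prior_log_var).exp())
        # Sum the per-dimension KL; a larger divergence from the learned prior
        # is interpreted as a more novel state.
        return kl_divergence(posterior, prior).sum(dim=-1)


# Usage (hypothetical): shape the reward as r_total = r_extrinsic + beta * r_int.
module = NoveltyModule(state_dim=8, latent_dim=4)
states = torch.randn(32, 8)
r_int = module.intrinsic_reward(states)  # shape: (32,)
```

Under this reading, states the agent visits often would be encoded close to the learned prior and yield small KL (low intrinsic reward), while rarely visited states would diverge from it and be rewarded as novel; the exact training objective and prior families are specified in the paper itself.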
Submission Number: 11