State Entropy Maximization with Random Encoders for Efficient Exploration

Anonymous

Published: 15 Jun 2022, Last Modified: 22 Jun 2025 | SSL-RL 2021 Poster | Readers: Everyone

Keywords: reinforcement learning, deep learning, exploration

TL;DR: We use the representation space of a random encoder to estimate state entropy, which is used as an intrinsic reward for exploration.

Abstract: Recent exploration methods have proven to be a recipe for improving sample-efficiency in deep reinforcement learning (RL). However, efficient exploration in high-dimensional observation spaces remains a challenge. This paper presents Random Encoders for Efficient Exploration (RE3), an exploration method that uses state entropy as an intrinsic reward. To estimate state entropy in environments with high-dimensional observations, we apply a $k$-nearest neighbor entropy estimator in the low-dimensional representation space of a convolutional encoder. In particular, we find that the state entropy can be estimated in a stable and compute-efficient manner with a randomly initialized encoder that is kept fixed throughout training. Our experiments show that RE3 significantly improves the sample-efficiency of both model-free and model-based RL methods on locomotion and navigation tasks from the DeepMind Control Suite and MiniGrid benchmarks. We also show that RE3 allows learning diverse behaviors without extrinsic rewards, effectively improving sample-efficiency in downstream tasks.
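The idea in the abstract can be illustrated with a minimal sketch: encode each observation with a fixed, randomly initialized encoder, then reward states whose $k$-th nearest neighbor in representation space is far away. Everything concrete below is an assumption made for illustration and is not taken from the abstract: a random linear projection stands in for the convolutional encoder, the reward takes the form `log(distance + 1)`, and the names `make_random_encoder` and `knn_intrinsic_reward` are hypothetical.

```python
import numpy as np


def make_random_encoder(obs_dim, rep_dim, seed=0):
    """Return a fixed random encoder (assumption: a linear projection
    standing in for a randomly initialized convolutional encoder)."""
    rng = np.random.default_rng(seed)
    # Weights are drawn once and never updated afterwards.
    weights = rng.standard_normal((obs_dim, rep_dim)) / np.sqrt(obs_dim)

    def encode(obs):
        obs = np.asarray(obs, dtype=np.float64)
        # Flatten each observation, then project to the low-dimensional space.
        return obs.reshape(obs.shape[0], -1) @ weights

    return encode


def knn_intrinsic_reward(reps, k=3):
    """Per-state reward from the distance to the k-th nearest neighbor.

    Assumption: reward = log(distance + 1). Requires k < len(reps).
    """
    # Pairwise Euclidean distances between all representations.
    diffs = reps[:, None, :] - reps[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # After sorting each row, column 0 is the self-distance (zero),
    # so column k holds the distance to the k-th nearest neighbor.
    kth_dist = np.sort(dists, axis=1)[:, k]
    return np.log(kth_dist + 1.0)


# Usage: a tight cluster of observations plus one far-away observation.
rng = np.random.default_rng(1)
cluster = 0.01 * rng.standard_normal((20, 64))
outlier = 10.0 * np.ones((1, 64))
observations = np.vstack([cluster, outlier])

encode = make_random_encoder(obs_dim=64, rep_dim=8, seed=0)
rewards = knn_intrinsic_reward(encode(observations), k=3)
print(int(np.argmax(rewards)))  # index of the most novel state
```

The brute-force pairwise distance matrix keeps the sketch short but costs quadratic memory in the number of states; a practical implementation would likely compute distances against a replay buffer in batches.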

Community Implementations: [6 code implementations](https://www.catalyzex.com/paper/state-entropy-maximization-with-random/code)
