BYOL-Explore: Exploration by Bootstrapped Prediction

Zhaohan Daniel Guo; Shantanu Thakoor; Miruna Pislar; Bernardo Avila Pires; Florent Altché; Corentin Tallec; Alaa Saade; Daniele Calandriello; Jean-Bastien Grill; Yunhao Tang; Michal Valko; Remi Munos; Mohammad Gheshlaghi Azar; Bilal Piot

Published: 31 Oct 2022, Last Modified: 20 Dec 2022, NeurIPS 2022 Accept, Readers: Everyone

Keywords: Exploration, Deep Reinforcement Learning, Representation Learning, Self-Supervised Learning

Abstract: We present BYOL-Explore, a conceptually simple yet general approach for curiosity-driven exploration in visually complex environments. BYOL-Explore learns the world representation, the world dynamics and the exploration policy all together by optimizing a single prediction loss in the latent space with no additional auxiliary objective. We show that BYOL-Explore is effective in DM-HARD-8, a challenging partially-observable continuous-action hard-exploration benchmark with visually rich 3-D environments. On this benchmark, we solve the majority of the tasks purely through augmenting the extrinsic reward with BYOL-Explore intrinsic reward, whereas prior work could only get off the ground with human demonstrations. As further evidence of the generality of BYOL-Explore, we show that it achieves superhuman performance on the ten hardest exploration games in Atari while having a much simpler design than other competitive agents.
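
The idea of using a latent-space prediction loss as an intrinsic reward that augments the extrinsic reward can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the L2 normalization of the latents (borrowed from BYOL-style objectives), and the mixing weight `beta` are all assumptions made for illustration, and the latents are plain Python lists rather than network outputs.

```python
import math


def l2_normalize(v, eps=1e-8):
    """Scale a vector to (approximately) unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def latent_prediction_error(predicted, target):
    """Squared distance between normalized predicted and target latents.

    Assumption: a BYOL-style normalized error, which ranges from 0
    (same direction) to 4 (opposite directions).
    """
    p = l2_normalize(predicted)
    t = l2_normalize(target)
    return sum((a - b) ** 2 for a, b in zip(p, t))


def augmented_reward(extrinsic, predicted, target, beta=0.1):
    """Extrinsic reward plus a weighted intrinsic (prediction-error) bonus.

    `beta` is a hypothetical mixing weight, not a value from the paper.
    """
    return extrinsic + beta * latent_prediction_error(predicted, target)
```

Under these assumptions, a transition whose next latent is predicted well earns almost no bonus, while a poorly predicted transition earns a larger one, which is what pushes the policy toward unfamiliar states.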

Supplementary Material: zip
