Walk the Random Walk: Learning to Discover and Reach Goals Without Supervision

Lina Mezghani; Piotr Bojanowski; Karteek Alahari; Sainbayar Sukhbaatar

Published: 23 Apr 2022, Last Modified: 05 May 2023
ALOE@ICLR2022 Spotlight
Readers: Everyone

Keywords: Goal-Conditioned RL, Unsupervised RL

TL;DR: We propose a method for developing goal-conditioned agents that learn to discover and reach goals without environment knowledge.

Abstract: Learning a diverse set of skills by interacting with an environment without any external supervision is an important challenge. In particular, obtaining a goal-conditioned agent that can reach any given state is useful in many applications. We propose a novel method for training such a goal-conditioned agent without any external rewards or domain knowledge about the environment. The first component of our method is a reachability network that learns to measure the similarity between two states from random interactions only. This reachability network is then used to build the second component, a memory of past observations that are diverse and well-balanced. Finally, we train the third component, a goal-conditioned policy network, with goals sampled from the memory, and reward it with the scores computed by the reachability network. All three components are updated throughout training as the agent explores and learns new skills. We demonstrate that our method allows training an agent for continuous control navigation, as well as robotic manipulation.
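The interplay of the three components described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `reachability_score` is a hypothetical hand-written similarity standing in for the learned reachability network, the `Memory` class uses an assumed novelty threshold to keep stored states diverse, and the policy rollout is replaced by a placeholder final state. All names and hyperparameters here are illustrative assumptions.

```python
import math
import random


def reachability_score(a, b, scale=1.0):
    """Hypothetical stand-in for the learned reachability network.

    Returns a similarity in (0, 1], where 1.0 means identical states.
    The paper learns this from random interactions; here it is a fixed
    formula on Euclidean distance, purely for illustration.
    """
    return math.exp(-math.dist(a, b) / scale)


class Memory:
    """Stores a state only if it is dissimilar to every stored state."""

    def __init__(self, novelty_threshold=0.5):
        self.states = []
        self.novelty_threshold = novelty_threshold  # assumed hyperparameter

    def maybe_add(self, state):
        # all() over an empty list is True, so the first state is always kept.
        is_novel = all(
            reachability_score(state, s) < self.novelty_threshold
            for s in self.states
        )
        if is_novel:
            self.states.append(state)
        return is_novel

    def sample_goal(self, rng):
        return rng.choice(self.states)


def random_walk(start, steps, rng, step_size=0.5):
    """Generate a trajectory of uniformly perturbed 2D states."""
    state = start
    trajectory = [state]
    for _ in range(steps):
        state = tuple(x + rng.uniform(-step_size, step_size) for x in state)
        trajectory.append(state)
    return trajectory


rng = random.Random(0)
memory = Memory()

# Fill the memory with diverse states seen during a random walk.
for state in random_walk((0.0, 0.0), steps=200, rng=rng):
    memory.maybe_add(state)

# Sample a goal from memory and score a (placeholder) rollout against it.
goal = memory.sample_goal(rng)
final_state = (0.0, 0.0)  # placeholder for where a policy rollout would end
reward = reachability_score(final_state, goal)
```

In this sketch the same scoring function both filters the memory and produces the reward, mirroring how the abstract describes the reachability network serving both roles; the periodic retraining of all three components is omitted.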
