Discovering and Achieving Goals with World Models

Published: 22 Jul 2021, Last Modified: 05 May 2023 | URL 2021 Oral | Readers: Everyone
Keywords: unsupervised goal reaching, unsupervised rl, goal-conditioned rl, exploration, model-based rl, world models
TL;DR: We propose an unsupervised RL agent that reaches diverse and challenging image goals in locomotion, manipulation, and kitchen tasks by combining an explorer policy and an achiever policy, both powered by world models.
Abstract: How can an artificial agent learn to solve a wide range of tasks in a complex visual environment in the absence of external supervision? We decompose this question into two problems: global exploration of the environment, and learning to reliably reach situations found during exploration. We introduce the Explore Achieve Network (ExaNet), a unified solution to both problems that learns a world model from high-dimensional images and uses it to train an explorer policy and an achiever policy from imagined trajectories. Unlike prior methods that explore by reaching previously visited states, our explorer uses foresight to plan toward unseen, surprising states, which then serve as diverse targets for the achiever. After the unsupervised phase, ExaNet solves tasks specified by goal images without any additional learning. We introduce a challenging benchmark spanning four standard robotic manipulation and locomotion domains with a total of over 40 test tasks. Our agent substantially outperforms previous approaches to unsupervised goal reaching and achieves goals that require interacting with multiple objects in sequence. Finally, to demonstrate the scalability and generality of our approach, we train a single general agent across four distinct environments. For videos, see https://sites.google.com/view/exanet/home.