Unsupervised Curricula for Visual Meta-Reinforcement Learning

Allan Jabri, Kyle Hsu, Ben Eysenbach, Abhishek Gupta, Alexei A Efros, Sergey Levine, Chelsea Finn

06 Sept 2019 (modified: 05 May 2023)
NeurIPS 2019
Readers: Everyone
Abstract: Meta-reinforcement learning algorithms leverage experience across many tasks to learn fast and effective reinforcement learning (RL) algorithms. However, current meta-RL methods depend critically on a manually-defined distribution of meta-training tasks, and hand-crafting these task distributions is challenging and time-consuming. We develop an unsupervised algorithm for inducing an adaptive meta-training task distribution, i.e. an automatic curriculum, by modeling unsupervised interaction in a visual environment. Crucially, the task distribution is scaffolded by the meta-learner's behavior, with density-based exploration driving the evolution of the task distribution. We formulate unsupervised meta-RL with an information-theoretic objective optimized via expectation-maximization over trajectory-level latent variables. Repeating this procedure leads to iterative reorganization of behavior, allowing the task distribution to adapt as the meta-learner becomes more competent. In our experiments on vision-based navigation and manipulation domains, we show that our algorithm allows for unsupervised meta-learning of skills that transfer to downstream tasks specified by human-provided reward functions, as well as pre-training for more efficient meta-learning on user-defined task distributions. To understand the nature of the curricula, we provide visualizations and analysis of the task distributions discovered throughout the learning process, finding that the emergent tasks span a range of environment-specific exploratory and exploitative behavior.
Code Link: https://github.com/ajabri/autocurricula
CMT Num: 5531
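
The abstract describes organizing unsupervised experience into tasks with trajectory-level latent variables and then rewarding behavior based on density. The following is a minimal sketch of that idea, not the authors' implementation: it fits an isotropic Gaussian mixture to toy 2-D "states" with a few EM-style updates and derives one reward per mixture component. Everything here is an assumption made for illustration: the function names, the toy data, the naive initialization, and in particular the reward form `lam * log q(s|z) - log q(s)`, which is one plausible log-density-ratio reward consistent with a mutual-information objective and is not specified in the abstract. The meta-learning step that would train a policy on these rewards is omitted.

```python
import math
import random


def log_gauss(x, mean, var):
    """Log density of an isotropic Gaussian N(mean, var * I) at point x."""
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return -0.5 * (len(x) * math.log(2 * math.pi * var) + sq_dist / var)


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def joint_logs(x, weights, means, variances):
    """log q(z) + log q(s | z) for every component z."""
    return [math.log(w) + log_gauss(x, m, v)
            for w, m, v in zip(weights, means, variances)]


def responsibilities(states, weights, means, variances):
    """Posterior q(z | s) for every state: soft assignment to latent tasks."""
    resp = []
    for x in states:
        logs = joint_logs(x, weights, means, variances)
        norm = logsumexp(logs)
        resp.append([math.exp(value - norm) for value in logs])
    return resp


def refit(states, resp, min_var=1e-3):
    """Re-estimate mixture weights, means, and variances from soft assignments."""
    n, dim, k = len(states), len(states[0]), len(resp[0])
    weights, means, variances = [], [], []
    for z in range(k):
        mass = sum(r[z] for r in resp) + 1e-12  # guard against empty components
        mean = [sum(r[z] * x[j] for r, x in zip(resp, states)) / mass
                for j in range(dim)]
        spread = sum(r[z] * sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
                     for r, x in zip(resp, states))
        weights.append(mass / n)
        means.append(mean)
        variances.append(max(spread / (mass * dim), min_var))
    return weights, means, variances


def fit_mixture(states, k, iters=30):
    """Alternate soft assignment and refitting for a fixed number of rounds."""
    n = len(states)
    # Naive initialization (an assumption): evenly spaced data points as means.
    means = [list(states[(i * n) // k]) for i in range(k)]
    weights = [1.0 / k] * k
    variances = [1.0] * k
    for _ in range(iters):
        resp = responsibilities(states, weights, means, variances)
        weights, means, variances = refit(states, resp)
    return weights, means, variances


def task_reward(x, z, weights, means, variances, lam=1.0):
    """Hypothetical per-task reward: lam * log q(s | z) - log q(s)."""
    log_cond = log_gauss(x, means[z], variances[z])
    log_marg = logsumexp(joint_logs(x, weights, means, variances))
    return lam * log_cond - log_marg


# Toy data: two well-separated clusters standing in for visited states.
rng = random.Random(0)
states = [(rng.gauss(0.0, 0.3), rng.gauss(0.0, 0.3)) for _ in range(40)]
states += [(rng.gauss(5.0, 0.3), rng.gauss(5.0, 0.3)) for _ in range(40)]

weights, means, variances = fit_mixture(states, k=2)
probe = (0.0, 0.0)
r_match = task_reward(probe, 0, weights, means, variances)
r_other = task_reward(probe, 1, weights, means, variances)
print("reward under task 0:", r_match)
print("reward under task 1:", r_other)
```

In this toy setting a state near the first cluster scores higher under the first task than under the second, which is the sense in which each mixture component can act as a distinct task. The `-log q(s)` term lowers the reward of frequently visited states, a simple stand-in for density-based exploration.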