On the reproducibility of gradient-based Meta-Reinforcement Learning baselines

Tristan Deleu, Simon Guiroy, Seyedarian Hosseini

Jun 11, 2018
  • Abstract: Meta-learning offers an appealing solution to the data inefficiency inherent in both deep supervised learning and (model-free) deep reinforcement learning. The diversity of tasks available in supervised meta-learning and meta-reinforcement learning has enabled the rapid progress recently observed in this field, since a new meta-learning method can easily be compared against existing algorithms. In this paper, we revisit one of these baselines on two basic meta-reinforcement learning problems: multi-armed bandits and tabular MDPs. We provide updated results for MAML applied to these two problems, and show that, contrary to what was previously reported, MAML compares favorably to more recent meta-learning approaches. Alongside this baseline, we also include new results on the same tasks for Reptile, a first-order meta-learning approach.
  • Keywords: meta-learning, reinforcement learning, maml
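To make the distinction between the two baselines concrete, the sketch below illustrates the Reptile meta-update (inner-loop SGD adaptation, then moving the initialization toward the adapted parameters). This is a toy illustration on hypothetical 1-D regression tasks, not the authors' code or experimental setup; all function names, task distributions, and hyperparameters here are assumptions for demonstration only.

```python
import numpy as np

# Illustrative sketch of a Reptile-style update, NOT the paper's code.
# Toy tasks: 1-D regressions y = a * x, each task sampling a different
# slope a; the meta-learner is a single scalar weight w.

rng = np.random.default_rng(0)

def inner_sgd(w, a, steps=5, lr=0.1):
    """Adapt w to one task (slope a) with a few SGD steps on squared error."""
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0, size=8)
        grad = np.mean(2 * (w * x - a * x) * x)  # d/dw of (w*x - a*x)^2
        w = w - lr * grad
    return w

def reptile(meta_iters=200, meta_lr=0.5):
    w = 0.0  # meta-initialization
    for _ in range(meta_iters):
        a = rng.uniform(0.5, 1.5)          # sample a task
        w_adapted = inner_sgd(w, a)        # inner-loop adaptation
        w = w + meta_lr * (w_adapted - w)  # Reptile: move toward adapted params
    return w

w_meta = reptile()
print(round(w_meta, 2))  # converges near the mean task slope (~1.0)
```

Unlike MAML, this update is first-order: it never differentiates through the inner-loop SGD steps, which is what makes Reptile cheaper while remaining a competitive baseline on these tasks.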