On the reproducibility of gradient-based Meta-Reinforcement Learning baselines

Tristan Deleu; Simon Guiroy; Seyedarian Hosseini

Published: 27 Jun 2018, Last Modified: 05 May 2023. Venue: ICML 2018 RML Submission. Readers: Everyone

Abstract: Meta-learning provides an appealing solution to the data-efficiency issue inherent in both deep supervised learning and (model-free) deep reinforcement learning. The diversity of tasks available in supervised meta-learning and meta-reinforcement learning has enabled the rapid progress recently observed in this field, since a new meta-learning method can easily be compared to existing algorithms. In this paper, we revisit one of these baselines, MAML, on two basic meta-reinforcement learning problems: multi-armed bandits and tabular MDPs. We provide updated results for MAML on these two problems and show that, contrary to what was previously reported, MAML compares favorably to more recent meta-learning approaches. Alongside this baseline, we also include new results on the same tasks for Reptile, a first-order meta-learning approach.

Keywords: meta-learning, reinforcement learning, maml
