Off-Policy Meta-Reinforcement Learning Based on Feature Embedding Spaces

12 Jun 2020 (modified: 13 Jul 2020) | ICML 2020 Workshop LifelongML Blind Submission | Readers: Everyone
  • Student First Author: No
  • Keywords: reinforcement learning, meta-learning, amortized inference, transfer learning
  • Abstract: Meta-reinforcement learning (meta-RL) addresses the sample inefficiency of deep RL by reusing experience obtained in past tasks when solving a new task. However, most meta-RL methods require partially or fully on-policy data, i.e., they cannot reuse data collected by past policies, which limits further gains in sample efficiency. To alleviate this problem, we propose a novel off-policy meta-RL method, embedding learning and evaluation of uncertainty (ELUE). ELUE learns a feature embedding space shared among tasks, beliefs over that embedding space, and a belief-conditional policy and Q-function. This approach has two major advantages: it can evaluate the uncertainty of tasks, which is expected to contribute to precise exploration, and it can improve its performance by updating its belief. Experiments on a meta-RL benchmark show that the proposed method outperforms existing methods.
  • TL;DR: A novel off-policy meta-RL method that learns beliefs over a feature embedding space shared among tasks.
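
The abstract describes maintaining a belief over a task embedding, updating it as data arrives, and conditioning the policy and Q-function on it. The sketch below only illustrates that general idea and is not the ELUE algorithm itself. It assumes, purely for illustration, that the belief is a diagonal Gaussian and that each observed transition contributes a Gaussian factor from some encoder. The names `GaussianBelief`, `update_belief`, and `belief_conditioned_input` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class GaussianBelief:
    """Diagonal Gaussian belief over a task embedding (assumed form, not from the paper)."""
    mean: Tuple[float, ...]
    var: Tuple[float, ...]


def update_belief(belief: GaussianBelief,
                  factor_mean: Sequence[float],
                  factor_var: Sequence[float]) -> GaussianBelief:
    """Multiply the belief by one Gaussian factor, dimension by dimension.

    The product of two Gaussians is Gaussian: precisions add, and the new
    mean is the precision-weighted average of the two means.
    """
    new_mean: List[float] = []
    new_var: List[float] = []
    for m, v, fm, fv in zip(belief.mean, belief.var, factor_mean, factor_var):
        precision = 1.0 / v + 1.0 / fv
        post_var = 1.0 / precision
        new_mean.append(post_var * (m / v + fm / fv))
        new_var.append(post_var)
    return GaussianBelief(tuple(new_mean), tuple(new_var))


def belief_conditioned_input(state: Sequence[float],
                             belief: GaussianBelief) -> List[float]:
    """Concatenate the state with the belief parameters.

    A belief-conditional policy or Q-function would take a vector like this
    as input (hypothetical interface).
    """
    return list(state) + list(belief.mean) + list(belief.var)


# Start from a unit-variance prior and fold in one hypothetical factor.
prior = GaussianBelief(mean=(0.0, 0.0), var=(1.0, 1.0))
posterior = update_belief(prior, factor_mean=(1.0, -1.0), factor_var=(1.0, 1.0))
policy_input = belief_conditioned_input([0.1, 0.2], posterior)
```

With a unit-variance prior and one unit-variance factor, the variance in each dimension halves from 1.0 to 0.5. In this sketch, a shrinking variance stands in for reduced task uncertainty, and passing the variance to the policy is one way a policy could take that uncertainty into account when exploring.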