Off-Policy Meta-Reinforcement Learning Based on Feature Embedding Spaces

Takahisa Imagawa; Takuya Hiraoka; Yoshimasa Tsuruoka

12 Jun 2020 (modified: 05 May 2023), LifelongML@ICML2020, Readers: Everyone

Student First Author: No

Keywords: reinforcement learning, meta-learning, amortized inference, transfer learning

Abstract: Meta-reinforcement learning (meta-RL) addresses the sample inefficiency of deep RL by reusing experience from past tasks when solving a new task. However, most meta-RL methods require partially or fully on-policy data, i.e., they cannot reuse data collected by past policies, which limits the gains in sample efficiency. To alleviate this problem, we propose a novel off-policy meta-RL method, embedding learning and evaluation of uncertainty (ELUE). ELUE learns a feature embedding space shared among tasks, beliefs over that embedding space, and a belief-conditional policy and Q-function. This approach has two major advantages: it can evaluate task uncertainty, which is expected to contribute to precise exploration, and it can improve its performance by updating its belief. Experiments on a meta-RL benchmark show that the proposed method outperforms existing methods.

TL;DR: A novel off-policy meta-RL method that learns beliefs over a feature embedding space shared among tasks.
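
The abstract's two ideas, a belief over a task embedding whose uncertainty can be evaluated and a policy conditioned on that belief, can be illustrated with a toy sketch. This is not the ELUE algorithm: it assumes a diagonal Gaussian belief with a closed-form conjugate update in place of the learned amortized inference, and the names `GaussianBelief`, `update`, `uncertainty`, and `belief_conditioned_input` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class GaussianBelief:
    """Diagonal Gaussian belief over a hypothetical task embedding z (assumption)."""
    mean: List[float]
    var: List[float]

    def update(self, obs: Sequence[float], obs_var: float) -> "GaussianBelief":
        """Conjugate Bayesian update from one noisy observation of z.

        Stand-in for learned inference: precisions add, so the posterior
        variance shrinks with every observation.
        """
        new_mean, new_var = [], []
        for m, v, o in zip(self.mean, self.var, obs):
            post_var = 1.0 / (1.0 / v + 1.0 / obs_var)
            new_var.append(post_var)
            new_mean.append(post_var * (m / v + o / obs_var))
        return GaussianBelief(new_mean, new_var)

    def uncertainty(self) -> float:
        """Total variance, used here as a simple scalar uncertainty measure."""
        return sum(self.var)


def belief_conditioned_input(state: Sequence[float], belief: GaussianBelief) -> List[float]:
    """Concatenate the state with the belief parameters.

    A belief-conditional policy or Q-function would consume this vector.
    """
    return list(state) + belief.mean + belief.var


# Start from a broad prior and refine it with three noisy task observations.
belief = GaussianBelief(mean=[0.0, 0.0], var=[1.0, 1.0])
before = belief.uncertainty()
for obs in ([0.9, -0.4], [1.1, -0.6], [1.0, -0.5]):
    belief = belief.update(obs, obs_var=0.5)
after = belief.uncertainty()

policy_input = belief_conditioned_input([0.2, 0.3, 0.1], belief)
```

In this sketch the uncertainty falls as observations arrive, which is the property the abstract ties to exploration, and the policy input changes whenever the belief is updated, so behavior can adapt without retraining the policy itself.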
