Train Hard, Fight Easy: Robust Meta Reinforcement Learning

Published: 20 Jul 2023, Last Modified: 29 Aug 2023 · EWRL16
Keywords: meta reinforcement learning, robust reinforcement learning, safe reinforcement learning, risk sensitive reinforcement learning
TL;DR: Making meta reinforcement learning more robust by learning to over-sample high-risk tasks throughout meta training.
Abstract: A major challenge of reinforcement learning (RL) in real-world applications is the variation between environments, tasks or clients. Meta-RL (MRL) addresses this issue by learning a meta-policy that adapts to new tasks. Standard MRL methods optimize the average return over tasks, but often suffer from poor results in tasks of high risk or difficulty. This limits system reliability whenever test tasks are not known in advance. In this work, we define a robust MRL objective with a controlled robustness level. Disturbingly, optimization of analogous robust objectives in RL is known to lead to both *biased gradients* and *data inefficiency*. The gradient bias is proven to disappear in MRL, which further motivates the proposed framework. The data inefficiency is addressed via the novel Robust Meta RL algorithm (RoML). RoML is a meta-algorithm that generates a robust version of any given MRL algorithm, by identifying and over-sampling harder tasks throughout training. We demonstrate that RoML achieves robust returns on multiple navigation and continuous control benchmarks.
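To make the over-sampling idea concrete, below is a minimal, non-authoritative sketch of how a "sample harder tasks more often" rule can be wired into a generic meta-training loop. The class name, the temperature and momentum hyperparameters, and the `run_meta_update` helper are illustrative assumptions for this sketch, not the paper's implementation or API.

```python
import math
import random

class HardTaskSampler:
    """Tracks a smoothed return estimate per task and samples tasks with
    probability that grows as their estimated return shrinks."""

    def __init__(self, tasks, temperature=1.0, momentum=0.9):
        self.tasks = list(tasks)
        self.temperature = temperature  # lower -> more aggressive focus on hard tasks
        self.momentum = momentum        # smoothing of per-task return estimates
        self.avg_return = {t: 0.0 for t in self.tasks}

    def sample(self):
        # Softmax over negative estimated returns: low-return (harder) tasks
        # receive larger sampling weights.
        scores = [-self.avg_return[t] / self.temperature for t in self.tasks]
        max_s = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - max_s) for s in scores]
        return random.choices(self.tasks, weights=weights, k=1)[0]

    def update(self, task, episodic_return):
        # Exponential moving average of observed returns for this task.
        self.avg_return[task] = (self.momentum * self.avg_return[task]
                                 + (1 - self.momentum) * episodic_return)


# Hypothetical usage inside a generic meta-training loop:
# sampler = HardTaskSampler(task_ids)
# for step in range(num_meta_steps):
#     task = sampler.sample()
#     ret = run_meta_update(task)   # assumed to run one meta-update and return the episodic return
#     sampler.update(task, ret)
```

The sketch only illustrates the general principle of biasing the task distribution toward low-return tasks; how the robustness level is controlled and how sampling weights are actually computed is specified in the paper.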