Keywords: deep learning, reinforcement learning, meta learning, robustness
TL;DR: We propose an adversarial training regime for the meta-RL domain outperforming the straightforward training regime in many environments.
Abstract: Recent years have seen tremendous progress in methods of reinforcement learning. However, most of these approaches have been trained in a straightforward fashion and are generally not robust to adversity, especially in the meta-RL setting. To the best of our knowledge, our work is the first to propose an adversarial training regime for Multi-Task Reinforcement Learning, which requires no manual intervention or domain knowledge of the environments. Our experiments on multiple environments in the Multi-Task Reinforcement learning domain demonstrate that the adversarial process leads to a better exploration of numerous solutions and a deeper understanding of the environment. We also adapt existing measures of causal attribution to draw insights from the skills learned, facilitating easier re-purposing of skills for adaptation to unseen environments and tasks.
Supplementary Material: zip
Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 2 code implementations](https://www.catalyzex.com/paper/arxiv:2201.11783/code)