Meta Reinforcement Learning for Fast Adaptation of Hierarchical Policies

21 May 2021 (modified: 05 May 2023) · NeurIPS 2021 Submission
Keywords: Hierarchical Reinforcement Learning, Meta-learning, Reinforcement Learning
TL;DR: A meta-learning approach for learning options that facilitate fast adaptation to a family of tasks.
Abstract: Hierarchical methods have the potential to allow reinforcement learning to scale to larger environments. Decomposing a task into transferable components, however, remains a challenging problem. In this paper, we propose a meta-learning approach for learning such a decomposition within the options framework. We formulate the objective as a bi-level optimization problem in which sub-policies and their terminations should facilitate fast learning on a family of tasks. Once such a set of options is obtained, it can be reused in new tasks, where only the sequencing of options needs to be learned. Our formulation tends to produce options with which fewer decisions are needed to solve such new tasks. Experimentally, we show that our method learns transferable components that accelerate learning and outperforms existing methods developed for this setting on the challenging ant-maze locomotion task.
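To make the bi-level structure concrete, below is a minimal, self-contained sketch of the idea the abstract describes: an inner loop that adapts a high-level policy over a fixed set of options on a single sampled task, and an outer loop that updates shared option parameters (here, termination probabilities) so that this adaptation is fast across the task family. This is not the authors' method or code; the environment (`ChainEnv`), the options, the epsilon-greedy inner loop, and the finite-difference outer update are all illustrative assumptions standing in for the paper's actual objective and optimizer.

```python
# Hypothetical sketch of bi-level option meta-learning (not the paper's code).
# Inner loop: adapt option-selection values on one task.
# Outer loop: update shared option parameters to maximize post-adaptation return.
import random

class ChainEnv:
    """1-D chain; tasks in the family differ only in goal position."""
    def __init__(self, goal, n=10):
        self.goal, self.n = goal, n
    def reset(self):
        self.pos = self.n // 2
        return self.pos
    def step(self, action):  # action in {-1, +1}
        self.pos = max(0, min(self.n - 1, self.pos + action))
        done = self.pos == self.goal
        return self.pos, (1.0 if done else -0.01), done

class Option:
    """A primitive sub-policy with a (meta-learned) termination probability."""
    def __init__(self, action, term_prob):
        self.action, self.term_prob = action, term_prob

def run_option(env, opt, max_len=20):
    """Execute one option until it terminates or the task is solved."""
    total, done = 0.0, False
    for _ in range(max_len):
        _, r, done = env.step(opt.action)
        total += r
        if done or random.random() < opt.term_prob:
            break
    return total, done

def inner_adapt(env, options, episodes=30, eps=0.2):
    """Inner loop: epsilon-greedy adaptation of option values on one task."""
    q, counts = [0.0] * len(options), [0] * len(options)
    for _ in range(episodes):
        env.reset()
        done, ret = False, 0.0
        while not done and ret > -5:  # crude episode cutoff
            i = (random.randrange(len(options)) if random.random() < eps
                 else max(range(len(options)), key=lambda j: q[j]))
            r, done = run_option(env, options[i])
            counts[i] += 1
            q[i] += (r - q[i]) / counts[i]  # running-average value update
            ret += r
    return q

def post_adapt_return(goal, term_probs, trials=5):
    """Average greedy return on a task *after* inner-loop adaptation."""
    options = [Option(-1, term_probs[0]), Option(+1, term_probs[1])]
    env = ChainEnv(goal)
    q = inner_adapt(env, options)
    total = 0.0
    for _ in range(trials):
        env.reset()
        done = False
        for _ in range(10):  # bounded greedy rollout with adapted values
            i = max(range(len(options)), key=lambda j: q[j])
            r, done = run_option(env, options[i])
            total += r
            if done:
                break
    return total / trials

def meta_step(term_probs, tasks, delta=0.05, lr=0.1):
    """Outer loop: finite-difference ascent on shared termination parameters."""
    for k in range(len(term_probs)):
        up, down = list(term_probs), list(term_probs)
        up[k] = min(0.95, up[k] + delta)
        down[k] = max(0.05, down[k] - delta)
        g = (sum(post_adapt_return(t, up) for t in tasks)
             - sum(post_adapt_return(t, down) for t in tasks)) / (2 * delta)
        term_probs[k] = min(0.95, max(0.05, term_probs[k] + lr * g / len(tasks)))
    return term_probs

if __name__ == "__main__":
    random.seed(0)
    term_probs = [0.5, 0.5]  # shared option parameters, meta-learned below
    for it in range(5):
        term_probs = meta_step(term_probs, tasks=[1, 8])
        print(f"iter {it}: termination probs = {term_probs}")
```

The design choice mirrors the abstract's intuition: because the outer objective scores returns *after* inner-loop adaptation, the option parameters are pushed toward sub-policies that make new tasks solvable with few high-level decisions, rather than toward options that are individually optimal on any single task.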
Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.
Supplementary Material: zip