Solving Compositional Reinforcement Learning Problems via Task Reduction

Yunfei Li; Yilin Wu; Huazhe Xu; Xiaolong Wang; Yi Wu

Published: 12 Jan 2021, Last Modified: 05 May 2023, ICLR 2021 Poster, Readers: Everyone

Keywords: compositional task, sparse reward, reinforcement learning, task reduction, imitation learning

Abstract: We propose a novel learning paradigm, Self-Imitation via Reduction (SIR), for solving compositional reinforcement learning problems. SIR is based on two core ideas: task reduction and self-imitation. Task reduction tackles a hard-to-solve task by actively reducing it to an easier task whose solution is known by the RL agent. Once the original hard task is successfully solved by task reduction, the agent naturally obtains a self-generated solution trajectory to imitate. By continuously collecting and imitating such demonstrations, the agent is able to progressively expand the solved subspace in the entire task space. Experimental results show that SIR can significantly accelerate and improve learning on a variety of challenging sparse-reward continuous-control problems with compositional structures. Code and videos are available at https://sites.google.com/view/sir-compositional.
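The two ideas in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the 1-D task space, the `value` and `rollout` stand-ins, the `SOLVABLE_RANGE` constant, and the choice of scoring an intermediate target by the product of two value estimates are all assumptions made for illustration. The sketch reduces a task the toy agent cannot solve directly to two tasks it can solve, then stores the stitched trajectory as a demonstration for later imitation.

```python
import math

SOLVABLE_RANGE = 3  # hypothetical: the toy agent already masters tasks this short


def value(start, goal):
    """Toy stand-in for a learned value function (estimated success probability)."""
    excess = max(0, abs(goal - start) - SOLVABLE_RANGE)
    return math.exp(-excess)


def rollout(start, goal):
    """Toy policy: succeeds only on short tasks; returns visited states or None."""
    if abs(goal - start) > SOLVABLE_RANGE:
        return None
    step = 1 if goal >= start else -1
    return list(range(start, goal + step, step))


def reduce_task(start, goal, candidates):
    """Pick the intermediate target that makes both sub-tasks look easiest.

    Assumption: the two value estimates are combined by a product.
    """
    return max(candidates, key=lambda mid: value(start, mid) * value(mid, goal))


def solve_with_reduction(start, goal, candidates, demo_buffer):
    """Try the task directly; otherwise reduce it and record a demonstration."""
    direct = rollout(start, goal)
    if direct is not None:
        return direct
    mid = reduce_task(start, goal, candidates)
    first = rollout(start, mid)
    second = rollout(mid, goal) if first is not None else None
    if first is None or second is None:
        return None  # reduction failed; nothing to imitate
    trajectory = first + second[1:]  # drop the duplicated intermediate state
    demo_buffer.append((start, goal, trajectory))  # self-generated demonstration
    return trajectory


demos = []
trajectory = solve_with_reduction(0, 6, range(0, 7), demos)
```

In this toy setting, the task from 0 to 6 is out of direct reach, so it is reduced through an intermediate target, and the resulting trajectory lands in `demos`. A full agent would then train on that buffer with an imitation loss alongside its RL objective, which is omitted here.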

One-sentence Summary: We propose a deep RL algorithm for learning compositional strategies to solve sparse-reward continuous-control problems.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Supplementary Material: zip

Code: [IrisLi17/self-imitation-via-reduction](https://github.com/IrisLi17/self-imitation-via-reduction)
