Bridging Sub-Tasks to Long-Horizon Task in Hierarchical Goal-Based Reinforcement Learning

Gangbok Lee; Youngsik Yoon; Jeongyeol Kwon; Sungsoo Ahn; Jungseul Ok

22 Sept 2023 (modified: 25 Mar 2024), ICLR 2024 Conference Withdrawn Submission

Keywords: Hierarchical Reinforcement Learning, Goal-based Reinforcement Learning

Abstract: Hierarchical goal-based reinforcement learning (HGRL) is a promising approach to learning a long-horizon task by decomposing it into a series of subtasks, each of which is to achieve a subgoal within a shorter horizon. However, the performance of HGRL crucially depends on the design of intrinsic rewards for these subtasks: as frequently observed in practice, short-sighted reward designs often lead the agent into undesirable states from which the final goal is no longer achievable. One potential remedy is to provide the agent with a means to evaluate the achievability of the final goal upon completion of each subtask; yet evaluating this achievability over a long planning horizon is itself a challenging task. In this work, we propose a subtask reward scheme that aims to bridge the gap between the long-horizon primary goal and short-horizon subtasks by incorporating look-ahead information about the next subgoals. We provide an extensive empirical analysis in MuJoCo environments, demonstrating the importance of looking ahead to subsequent subgoals and the improvement obtained when the proposed framework is applied to existing HGRL baselines.

Supplementary Material: zip

Primary Area: reinforcement learning

Submission Number: 5252
