Does Hierarchical Reinforcement Learning Outperform Standard Reinforcement Learning in Goal-Oriented Environments?

Published: 03 Nov 2023, Last Modified: 27 Nov 2023, GCRL Workshop
Confirmation: I have read and confirm that at least one author will be attending the workshop in person if the submission is accepted
Keywords: Goal-oriented, GCRL, HRL, Temporal Abstraction
TL;DR: Performance analysis of standard RL and HRL in goal-oriented domains
Abstract: Hierarchical Reinforcement Learning (HRL) targets long-horizon decision-making problems by decomposing a task into a hierarchy of subtasks. Many HRL methods perform bottom-up temporal abstraction automatically while simultaneously learning a hierarchical policy. In this study, we assess the performance of standard RL and HRL within a customizable 2D Minecraft domain with varying difficulty levels. We observe that, without a priori knowledge, predefined subgoal structures, or well-shaped reward structures, HRL methods surprisingly do not outperform all standard RL methods in the 2D Minecraft domain. We also provide clues to the underlying reasons for this outcome, e.g., whether HRL methods that incorporate automatic temporal abstraction can discover bottom-up action abstractions that match the intrinsic top-down task decomposition, often referred to as "goal-directed behavior" in goal-oriented environments.
Submission Number: 3