Does Hierarchical Reinforcement Learning Outperform Standard Reinforcement Learning in Goal-Oriented Environments?

Ziyan Luo; Yijie Zhang; Zhaoyue Wang

Published: 03 Nov 2023, Last Modified: 27 Nov 2023, GCRL Workshop

Confirmation: I have read and confirm that at least one author will be attending the workshop in person if the submission is accepted

Keywords: Goal-oriented, GCRL, HRL, Temporal Abstraction

TL;DR: Performance analysis of standard RL and HRL in goal-oriented domains

Abstract: Hierarchical Reinforcement Learning (HRL) targets long-horizon decision-making problems by decomposing the task into a hierarchy of subtasks. Many HRL methods perform bottom-up temporal abstraction automatically while simultaneously learning a hierarchical policy. In this study, we assess the performance of standard RL and HRL within a customizable 2D Minecraft domain with varying difficulty levels. We observe that, without a priori knowledge, predefined subgoal structures, and well-shaped reward structures, HRL methods surprisingly do not outperform all standard RL methods in the 2D Minecraft domain. We also provide clues to the underlying reasons for this outcome, e.g., whether HRL methods that incorporate automatic temporal abstraction can discover bottom-up action abstractions that match the intrinsic top-down task decomposition, often referred to as "goal-directed behavior" in goal-oriented environments.

Submission Number: 3
