TL;DR: We theoretically show an exponential improvement of Deep HRL over the standard RL framework.
Abstract: Modern complex sequential decision-making problems often require both low-level policy execution and high-level planning. Deep hierarchical reinforcement learning (Deep HRL) admits multi-layer abstractions that naturally model the policy hierarchically, and it is believed that Deep HRL can reduce the sample complexity compared to standard RL frameworks. We initiate the study of rigorously characterizing the complexity of Deep HRL. We present a model-based optimistic algorithm which demonstrates that the complexity of learning a near-optimal policy for Deep HRL scales with the sum of the numbers of states at each abstraction layer, whereas standard RL scales with their product. Our algorithm achieves this by exploiting the fact that distinct high-level states share similar low-level structure, which allows efficient information exploitation: experience from different high-level state-action pairs can be generalized to unseen state-actions. Overall, our result shows an exponential improvement of Deep HRL over the standard RL framework.
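A toy back-of-the-envelope illustration (not from the paper; the layer sizes are hypothetical) of the sum-versus-product scaling claimed in the abstract: a flat RL agent faces a state space whose size is the product of the per-layer state counts, while the hierarchical view only pays for their sum.

```python
# Toy illustration (hypothetical numbers, not from the paper): compare the
# effective number of states a flat (standard) RL agent must handle versus
# a hierarchical one, when the problem factors into abstraction layers.
from math import prod

layer_sizes = [10, 10, 10]  # hypothetical: 3 abstraction layers, 10 states each

flat_states = prod(layer_sizes)          # standard RL: product over layers
hierarchical_states = sum(layer_sizes)   # Deep HRL: sum over layers

print(f"flat RL state space:       {flat_states}")         # 1000
print(f"deep HRL effective states: {hierarchical_states}")  # 30
```

With more layers the gap grows exponentially in the number of layers, which is the sense in which the abstract's "exponential improvement" should be read.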
Keywords: hierarchical model, reinforcement learning, low regret, online learning, tabular reinforcement learning