Hierarchical reinforcement learning from imperfect demonstrations through reachable coverage-based subgoal filtering
Highlights:
• Hierarchical reinforcement learning from demonstrations (HRLfD) vastly improves RL performance on large, complex tasks.
• Our approach greatly alleviates the problem of imperfect demonstrations, which is unavoidable in learning from demonstrations (LfD).
• We propose a novel measure to discriminate noisy demonstrations that negatively affect learning in HRLfD.
• Our method outperforms various state-of-the-art baselines on Maze tasks and robotic arm tasks.