Hierarchical reinforcement learning from imperfect demonstrations through reachable coverage-based subgoal filtering

Published: 01 Jan 2024, Last Modified: 14 May 2025. Knowl. Based Syst. 2024. CC BY-SA 4.0
Abstract: Highlights
• The use of HRLfD vastly improves the performance of RL in large and complex tasks.
• We greatly alleviate the inevitable problem of imperfect demonstrations in LfD.
• We propose a novel measure to discriminate negative noise demonstrations in HRLfD.
• Our method outperforms various SOTAs in Maze tasks and robotic arm tasks.