Abstract: Sparse-reward problems present a challenge for reinforcement learning (RL) agents. Previous work has shown that choosing start states according to a curriculum can significantly improve learning performance. We observe that many existing curriculum generation algorithms rely on two key components: performance measure estimation and a start state selection policy. We therefore propose a unifying framework for performance-based start state curricula in RL, which allows us to analyze and compare how each of the two key components influences performance. Furthermore, we introduce a new start state selection policy based on spatial gradients of the performance measure. We conduct extensive empirical evaluations to compare performance-based start state curricula and to investigate the influence of performance measure model choice and estimation. Benchmarking on difficult robotic navigation tasks and a high-dimensional robotic manipulation task, we demonstrate state-of-the-art performance of our novel spatial gradient curriculum.
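The abstract does not spell out the selection rule, so the following is a minimal illustrative sketch of one plausible reading, not the paper's actual algorithm: candidate start states lie on a 2-D grid, the performance measure is an estimated per-state success rate, and starts are sampled with probability proportional to the magnitude of the measure's spatial gradient (i.e., near the frontier between solved and unsolved regions). The names `spatial_gradient_scores` and `select_starts` are hypothetical.

```python
import numpy as np

def spatial_gradient_scores(perf, eps=1e-8):
    """Score each cell of a 2-D performance-measure grid by the magnitude
    of its spatial gradient (finite differences via np.gradient).
    NOTE: illustrative assumption, not the paper's stated method."""
    gy, gx = np.gradient(perf)           # partial derivatives along rows / cols
    return np.sqrt(gx**2 + gy**2) + eps  # eps keeps every cell selectable

def select_starts(perf, rng, n_starts=8):
    """Sample start-state grid indices with probability proportional to the
    spatial gradient magnitude of the estimated performance measure."""
    scores = spatial_gradient_scores(perf)
    p = scores.ravel() / scores.ravel().sum()
    flat = rng.choice(perf.size, size=n_starts, replace=False, p=p)
    return np.unravel_index(flat, perf.shape)  # (rows, cols) of chosen starts

# Usage: a toy 10x10 success-rate estimate that decays with distance
# from the goal at cell (0, 0); values and shapes are made up.
rng = np.random.default_rng(0)
perf = np.clip(1.0 - np.indices((10, 10)).sum(axis=0) / 18.0, 0.0, 1.0)
rows, cols = select_starts(perf, rng)
print(list(zip(rows.tolist(), cols.tolist())))
```

Under these assumptions, the gradient magnitude is highest where estimated competence changes quickly, so sampling concentrates new episodes at start states just beyond the agent's current reach, which is the intuition a performance-gradient curriculum would exploit.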