Explore with Dynamic Map: Graph Structured Reinforcement Learning

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Blind Submission · Readers: Everyone
Keywords: Deep Reinforcement Learning, Graph Structured Reinforcement Learning, Exploration
Abstract: In reinforcement learning, a map of states and transitions built from historical trajectories is often helpful for exploration and exploitation. Even so, learning and planning on such a map in a sparse-reward environment remains a challenge. As a step towards this goal, we propose Graph Structured Reinforcement Learning (GSRL), which utilizes historical trajectories to slowly adjust exploration directions and learn from related experiences while rapidly updating the value function estimate. GSRL constructs a dynamic graph on top of the state transitions in the replay buffer and develops an attention strategy on this graph to select an appropriate goal direction, which decomposes the task of reaching a distant goal state into a sequence of easier subtasks. We also leverage the graph structure to sample related trajectories for efficient value learning. Results demonstrate that GSRL outperforms state-of-the-art algorithms in terms of sample efficiency on benchmarks with sparse reward functions.
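The core data structure described in the abstract — a dynamic graph maintained over state transitions drawn from the replay buffer, with a goal-selection rule on top — can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the class name `TransitionGraph`, the edge visit counts, and the least-visited-successor heuristic standing in for the paper's learned attention strategy are all assumptions made for the sake of the example.

```python
from collections import defaultdict


class TransitionGraph:
    """Dynamic graph over replay-buffer state transitions.

    Illustrative sketch only: the paper's attention strategy over the
    graph is approximated here by a simple count-based heuristic.
    """

    def __init__(self):
        self.edge_counts = defaultdict(int)  # (s, s_next) -> visit count
        self.successors = defaultdict(set)   # s -> set of successor states

    def add_transition(self, s, s_next):
        """Update the graph with one transition taken from the replay buffer."""
        self.edge_counts[(s, s_next)] += 1
        self.successors[s].add(s_next)

    def select_goal(self, s):
        """Pick a subgoal direction from state s.

        Count-based proxy for the paper's attention mechanism: prefer
        the least-visited successor to encourage exploration. Returns
        None if s has no recorded successors.
        """
        succ = self.successors.get(s)
        if not succ:
            return None
        return min(succ, key=lambda t: self.edge_counts[(s, t)])
```

In use, the agent would call `add_transition` whenever a new experience enters the replay buffer and `select_goal` at planning time, so that reaching a distant goal is decomposed into hops between neighboring graph nodes.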
One-sentence Summary: In this paper, we propose to construct a dynamic graph on top of the state transitions in the replay buffer and leverage the graph structure for learning and planning.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Supplementary Material: zip
Reviewed Version (pdf): https://openreview.net/references/pdf?id=gScVGnIzDn