Keywords: Reinforcement Learning, Exploration, Multi-goal reinforcement learning, Multi-task reinforcement learning
TL;DR: Setting some temporary goals for a multi-goal reinforcement learning agent can help with exploration and identifying unachievable regions.
Abstract: Exploration has always been a crucial aspect of reinforcement learning. When facing long-horizon, sparse-reward environments, modern methods still struggle with effective exploration and generalize poorly. In the multi-goal reinforcement learning setting, out-of-distribution goals may appear similar to previously achieved ones, yet the agent can never accurately assess its ability to reach them without attempting them. To enable faster exploration and improve generalization, we propose an exploration method that lets the agent temporarily pursue the most meaningful nearby goal. Through experiments in four multi-goal environments, including a 2D PointMaze, an AntMaze, and a foraging world, we show that our method improves an agent's ability to estimate the achievability of out-of-distribution goals as well as its frontier exploration strategy.