Multi-goal Reinforcement Learning via Exploring Successor Matching

Published: 01 Jan 2022, Last Modified: 12 May 2023 · CoG 2022
Abstract: A multi-goal reinforcement learning (RL) agent aims to achieve and generalize over various goals. Due to the sparsity of goal-reaching rewards, it suffers from unreliable value estimation and is thus unable to efficiently identify states that are essential for reaching specific goals. To deal with this problem, we propose Exploring Successor Matching (ESM), a framework that trains a goal-conditioned policy and progressively encourages multi-goal exploration towards the promising frontier. ESM adopts the idea of successor features and extends it to a goal-reaching successor mapping that serves as a more stable state feature under sparse rewards. After acquiring the successor mapping, it further explores intrinsic goals that are more likely to be achieved from a diverse set of states in terms of future state occupancies. Experiments on challenging manipulation tasks show that ESM deals well with sparse rewards and achieves better sample efficiency.
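To make the successor-feature idea concrete: with one-hot state features, the successor representation of a state estimates its expected discounted future state occupancies, learnable by a TD update. The sketch below is a minimal tabular illustration under that standard formulation, not the paper's actual goal-reaching successor mapping; all names (`psi`, `phi`, `td_update`) and the toy chain environment are assumptions for illustration.

```python
import numpy as np

n_states, gamma, alpha = 5, 0.9, 0.1

# psi[s] estimates the expected discounted future one-hot state
# occupancies when starting from state s (the successor representation).
psi = np.zeros((n_states, n_states))

def phi(s):
    """One-hot state feature; its discounted sum over a trajectory
    yields future state occupancies."""
    f = np.zeros(n_states)
    f[s] = 1.0
    return f

def td_update(s, s_next):
    """Standard successor-feature TD update:
    psi(s) <- psi(s) + alpha * (phi(s) + gamma * psi(s') - psi(s))."""
    target = phi(s) + gamma * psi[s_next]
    psi[s] += alpha * (target - psi[s])

# Roll out a simple deterministic chain 0 -> 1 -> 2 -> 3 -> 4 repeatedly.
for _ in range(2000):
    for s in range(n_states - 1):
        td_update(s, s + 1)

# psi[0] now approximates [1, gamma, gamma^2, gamma^3, 0]:
# earlier chain states discount later occupancies geometrically.
print(psi[0].round(3))
```

Under sparse goal-reaching rewards, such an occupancy-based representation stays informative even when value estimates do not, which is the property the abstract's "more stable state feature" appeals to.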