Exploration-Guided Reward Shaping for Reinforcement Learning under Sparse Rewards

Rati Devidze; Parameswaran Kamalaruban; Adish Singla

Published: 31 Oct 2022, Last Modified: 08 Jan 2023, NeurIPS 2022 Accept, Readers: Everyone

Keywords: reward shaping, intrinsic rewards, reinforcement learning, sparse-reward environments

TL;DR: We propose a novel framework, Exploration-Guided Reward Shaping, that operates in a fully self-supervised manner and can accelerate an agent's learning even in sparse-reward environments.

Abstract: We study the problem of reward shaping to accelerate the training process of a reinforcement learning agent. Existing works have considered a number of different reward shaping formulations; however, they either require external domain knowledge or fail in environments with extremely sparse rewards. In this paper, we propose a novel framework, Exploration-Guided Reward Shaping (ExploRS), that operates in a fully self-supervised manner and can accelerate an agent's learning even in sparse-reward environments. The key idea of ExploRS is to learn an intrinsic reward function in combination with exploration-based bonuses to maximize the agent's utility w.r.t. extrinsic rewards. We theoretically showcase the usefulness of our reward shaping framework in a special family of MDPs. Experimental results on several environments with sparse/noisy reward signals demonstrate the effectiveness of ExploRS.
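
To make the key idea concrete, the following is a minimal sketch of combining an extrinsic reward with an intrinsic reward term and an exploration bonus. It is an illustration only and not the authors' implementation: the class name `ShapedReward`, the tabular intrinsic reward, the additive form, and the count-based bonus `bonus_scale / sqrt(N(s) + 1)` are all assumptions made for this example. In ExploRS the intrinsic reward is learned; here it is a plain table that a caller would have to fill in.

```python
import math
from collections import defaultdict


class ShapedReward:
    """Hypothetical helper: extrinsic reward + intrinsic term + count-based bonus."""

    def __init__(self, bonus_scale=1.0):
        self.bonus_scale = bonus_scale
        # Number of times each state has been visited so far.
        self.visit_counts = defaultdict(int)
        # Stand-in for a learned intrinsic reward, keyed by (state, action).
        # All entries default to 0.0 until something sets them.
        self.intrinsic = defaultdict(float)

    def observe(self, state):
        """Record one visit to `state`."""
        self.visit_counts[state] += 1

    def bonus(self, state):
        """Exploration bonus that shrinks as `state` is visited more often."""
        count = self.visit_counts.get(state, 0)
        return self.bonus_scale / math.sqrt(count + 1)

    def shaped(self, state, action, next_state, extrinsic):
        """Assumed additive combination of the three reward signals."""
        intrinsic = self.intrinsic.get((state, action), 0.0)
        return extrinsic + intrinsic + self.bonus(next_state)


shaper = ShapedReward(bonus_scale=1.0)
# With zero extrinsic reward, an unvisited next state still yields a signal.
first = shaper.shaped("s0", "right", "s1", extrinsic=0.0)
for _ in range(3):
    shaper.observe("s1")
# After three visits the bonus for "s1" has decayed.
later = shaper.shaped("s0", "right", "s1", extrinsic=0.0)
```

Under these assumptions, the bonus keeps the training signal non-zero in states the agent has rarely seen, which is the situation sparse extrinsic rewards create, and it fades as those states become familiar.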

Supplementary Material: pdf