Self-Supervised State-Control through Intrinsic Mutual Information Rewards

Rui Zhao; Volker Tresp; Wei Xu

25 Sept 2019 (modified: 05 May 2023) | ICLR 2020 Conference Blind Submission | Readers: Everyone

Keywords: Intrinsic Reward, Deep Reinforcement Learning, Skill Discovery, Mutual Information, Self-Supervised Learning, Unsupervised Learning

TL;DR: This paper introduces Mutual Information-based State-Control, a self-supervised reinforcement learning framework for discovering robotic manipulation skills.

Abstract: Learning to discover useful skills without a manually designed reward function would have many applications, yet it remains a challenge for reinforcement learning. In this paper, we propose Mutual Information-based State-Control (MISC), a new self-supervised reinforcement learning approach for learning to control states of interest without any external reward function. We formulate the intrinsic objective as rewarding the skills that maximize the mutual information between the context states and the states of interest. For example, in robotic manipulation tasks, the context states are the robot states and the states of interest are the states of an object. We evaluate our approach on several simulated robotic manipulation tasks from OpenAI Gym. We show that our method learns to manipulate the object, for example by pushing it or picking it up, purely from the intrinsic mutual information rewards. Furthermore, the pre-trained policy and mutual information discriminator can be used to accelerate learning to achieve high task rewards. Our results show that the mutual information between the context states and the states of interest can be an effective ingredient for overcoming challenges in robotic manipulation tasks with sparse rewards. A video showing experimental results is available at https://youtu.be/cLRrkd3Y7vU
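To make the intrinsic objective concrete, the following is a minimal sketch of a mutual-information reward between a context state and a state of interest. It is not the estimator used in the paper: the abstract only mentions a "mutual information discriminator" without specifying it, so this sketch swaps in the closed-form mutual information of two jointly Gaussian scalars, I = -0.5 * log(1 - rho^2), where rho is their correlation. The function names, the scalar states, and the toy trajectories are all hypothetical and chosen for illustration only.

```python
import math
import random


def pearson_correlation(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0  # a constant signal shares no information with anything
    return cov / math.sqrt(var_x * var_y)


def gaussian_mi_reward(context_states, states_of_interest):
    """Illustrative intrinsic reward (assumption, not the paper's estimator).

    Treats both state sequences as samples of jointly Gaussian scalars and
    returns their mutual information in nats: -0.5 * log(1 - rho^2).
    """
    rho = pearson_correlation(context_states, states_of_interest)
    # Clip so that a perfectly correlated pair gives a finite reward.
    rho_sq = min(rho * rho, 1.0 - 1e-12)
    return -0.5 * math.log(1.0 - rho_sq)


# Toy trajectories (hypothetical): a gripper position following a random walk.
rng = random.Random(0)
gripper = [0.0]
for _ in range(199):
    gripper.append(gripper[-1] + rng.gauss(0.0, 0.05))

# Case 1: the object is pushed along with the gripper, up to small noise.
pushed_object = [g + rng.gauss(0.0, 0.01) for g in gripper]
# Case 2: the object is never touched and stays where it is.
untouched_object = [0.0] * len(gripper)

reward_pushed = gaussian_mi_reward(gripper, pushed_object)
reward_untouched = gaussian_mi_reward(gripper, untouched_object)
print("reward when the object moves with the robot:", reward_pushed)
print("reward when the object never moves:", reward_untouched)
```

In this toy setting the reward is zero when the object never moves and positive when the object's state depends on the robot's state, which mirrors the intuition in the abstract: a policy is rewarded for bringing the states of interest under the influence of the context states. A practical implementation would likely need a learned estimator for high-dimensional, non-Gaussian states.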

Code: https://github.com/misc-project/misc
