Mutual Information State Intrinsic Control

Rui Zhao; Yang Gao; Pieter Abbeel; Volker Tresp; Wei Xu

Published: 12 Jan 2021, Last Modified: 22 Jun 2025 | ICLR 2021 Spotlight | Readers: Everyone

Keywords: Intrinsically Motivated Reinforcement Learning, Intrinsic Reward, Intrinsic Motivation, Deep Reinforcement Learning, Reinforcement Learning

Abstract: Reinforcement learning has been shown to be highly successful at many challenging tasks. However, success heavily relies on well-shaped rewards. Intrinsically motivated RL attempts to remove this constraint by defining an intrinsic reward function. Motivated by the self-consciousness concept in psychology, we make a natural assumption that the agent knows what constitutes itself, and propose a new intrinsic objective that encourages the agent to have maximum control over the environment. We mathematically formalize this reward as the mutual information between the agent state and the surrounding state under the current agent policy. With this new intrinsic motivation, we are able to outperform previous methods, including being able to complete the pick-and-place task for the first time without using any task reward. A video showing experimental results is available at https://youtu.be/AUCwc9RThpk.
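
The quantity named in the abstract, mutual information between the agent state and the surrounding state, can be illustrated with a small sketch. The code below is only a plug-in estimate over discretized samples, written for illustration; it is not the authors' estimator, and the function name `empirical_mutual_information` and its arguments are hypothetical. It assumes the two state sequences are paired (sampled at the same time steps under one policy) and take hashable discrete values.

```python
from collections import Counter
from math import log


def empirical_mutual_information(agent_states, surrounding_states):
    """Plug-in estimate (in nats) of mutual information from paired samples.

    Hypothetical helper for illustration only: both inputs are sequences of
    hashable, discretized states observed at the same time steps.
    """
    if len(agent_states) != len(surrounding_states):
        raise ValueError("sequences must have equal length")
    n = len(agent_states)
    if n == 0:
        raise ValueError("need at least one sample")

    # Empirical joint and marginal counts.
    joint = Counter(zip(agent_states, surrounding_states))
    agent_counts = Counter(agent_states)
    surrounding_counts = Counter(surrounding_states)

    mi = 0.0
    for (a, s), count in joint.items():
        # p(a, s) * log( p(a, s) / (p(a) * p(s)) ), written with raw counts.
        mi += (count / n) * log(count * n / (agent_counts[a] * surrounding_counts[s]))
    return mi


# Surroundings that track the agent exactly give high mutual information ...
coupled = empirical_mutual_information([0, 1, 0, 1], [0, 1, 0, 1])
# ... while surroundings that vary independently of the agent give zero.
independent = empirical_mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

In this toy setting the estimate is largest when the surrounding state is fully determined by the agent state, which matches the intuition of an agent "controlling" its surroundings. Continuous, high-dimensional states would need a different estimator than this counting approach.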

One-sentence Summary: Motivated by the self-consciousness concept in psychology, we propose a new intrinsic objective that encourages the agent to have maximum control over the environment.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Supplementary Material: zip

Code: [ruizhaogit/music](https://github.com/ruizhaogit/music) + [1 community implementation](https://paperswithcode.com/paper/?openreview=OthEq8I5v1)

Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/mutual-information-state-intrinsic-control/code)
