DS-VIC: Unsupervised Discovery of Decision States for Transfer in RL

Nirbhay Modhe; Prithvijit Chattopadhyay; Mohit Sharma; Abhishek Das; Devi Parikh; Dhruv Batra; Ramakrishna Vedantam

25 Sept 2019 (modified: 05 May 2023), ICLR 2020 Conference Blind Submission, Readers: Everyone

Keywords: reinforcement learning, probabilistic inference, variational inference, intrinsic control, transfer learning

TL;DR: Identify decision states (where the agent can take actions that matter) without reward supervision, and use them for transfer.

Abstract: We learn to identify decision states, namely the parsimonious set of states where decisions meaningfully affect the future states an agent can reach in an environment. We utilize the VIC framework, which maximizes an agent's "empowerment", i.e., the ability to reliably reach a diverse set of states, and formulate a sandwich bound on the empowerment objective that allows identification of decision states. Unlike previous work, our decision states are discovered without extrinsic rewards, simply by interacting with the world. Our results show that our decision states 1) are often interpretable, and 2) lead to better exploration on downstream goal-driven tasks in partially observable environments.

Code: https://anonymous.4open.science/r/90a4a23e-38d1-435a-8fb4-dc6795f79615/


