Causally Correct Partial Models for Reinforcement Learning

Danilo J. Rezende; Ivo Danihelka; George Papamakarios; Nan Rosemary Ke; Ray Jiang; Theophane Weber; Karol Gregor; Hamza Merzic; Fabio Viola; Jane Wang; Jovana Mitrovic; Frederic Besse; Ioannis Antonoglou; Lars Buesing; Julian Schrittwieser; Thomas Hubert; David Silver

25 Sept 2019 (modified: 05 May 2023) | ICLR 2020 Conference Blind Submission | Readers: Everyone

TL;DR: Causally correct partial models do not have to generate the whole observation to remain causally correct in stochastic environments.

Abstract: In reinforcement learning, we can learn a model of future observations and rewards, and use it to plan the agent's next actions. However, jointly modeling future observations can be computationally expensive or even intractable if the observations are high-dimensional (e.g. images). For this reason, previous works have considered partial models, which model only part of the observation. In this paper, we show that partial models can be causally incorrect: they are confounded by the observations they don't model, and can therefore lead to incorrect planning. To address this, we introduce a general family of partial models that are provably causally correct, but avoid the need to fully model future observations.
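The confounding described in the abstract can be illustrated with a small one-step toy problem. The sketch below is not the paper's construction; the environment, the behavior policy, and every name in it (`P_Z`, `behavior_policy`, `reward`, and so on) are assumptions made up for illustration. A hidden observation `z` influences both the logged action and the reward, so a partial model that conditions only on the action learns the observational quantity p(r | a) rather than the interventional quantity p(r | do(a)) that a planner needs.

```python
# Toy illustration (hypothetical setup, not taken from the paper).
# z is part of the observation that the partial model does NOT model.
P_Z = {0: 0.5, 1: 0.5}  # assumed prior over the unmodeled observation


def behavior_policy(a, z):
    """Probability that the data-collecting policy picks action a after seeing z."""
    return 0.9 if a == z else 0.1


def reward(a, z):
    """Reward is 1 when the action matches the hidden observation."""
    return 1.0 if a == z else 0.0


def observational_reward(a):
    """E[r | a] under the logged data: what a naive partial model would fit."""
    joint = {z: P_Z[z] * behavior_policy(a, z) for z in P_Z}
    p_a = sum(joint.values())
    # Posterior p(z | a) is skewed because the policy looked at z.
    return sum(joint[z] / p_a * reward(a, z) for z in P_Z)


def interventional_reward(a):
    """E[r | do(a)]: the action is forced, so z keeps its prior distribution."""
    return sum(P_Z[z] * reward(a, z) for z in P_Z)


if __name__ == "__main__":
    print("naive partial model E[r | a=1]:", observational_reward(1))
    print("causally correct E[r | do(a=1)]:", interventional_reward(1))
```

Under these assumed numbers, the naive estimate for action 1 is about 0.9, while the interventional value is 0.5. A planner that selects actions without access to `z` would therefore overestimate its return if it trusted the naive model.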

Keywords: causality, model-based reinforcement learning

