Deconfounded Imitation Learning

Risto Vuorio; Pim De Haan; Johann Brehmer; Hanno Ackermann; Daniel Dijkman; Taco Cohen

08 Oct 2022 (modified: 05 May 2023); Deep RL Workshop 2022; Readers: Everyone

Keywords: Imitation learning, causality

TL;DR: We propose a theoretically and empirically effective algorithm for confounded imitation learning.

Abstract: Standard imitation learning can fail when the expert demonstrators have different sensory inputs than the imitating agent. This partial observability introduces hidden confounders into the causal graph, which cause imitation to fail. We break down the space of confounded imitation learning problems and identify three settings, each with different data requirements, in which the correct imitation policy can be identified. We then introduce an algorithm for deconfounded imitation learning that trains an inference model jointly with a latent-conditional policy. At test time, the agent alternates between updating its belief over the latent and acting under that belief. We show in theory and practice that this algorithm converges to the correct interventional policy, solves the confounding issue, and, under certain assumptions, achieves asymptotically optimal imitation performance.
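
The test-time loop described in the abstract (update the belief over the latent, then act under that belief) can be illustrated with a minimal toy sketch. This is not the paper's implementation: the two-armed bandit, the constants `P_GOOD` and `P_BAD`, and all function names are hypothetical, an exact Bayes update stands in for the learned inference model, and a hand-coded rule stands in for the learned latent-conditional policy.

```python
import random

# Hypothetical toy problem: the latent theta in {0, 1} names the arm that
# pays off with probability P_GOOD; the other arm pays off with P_BAD.
P_GOOD, P_BAD = 0.9, 0.1


def reward_prob(theta, action):
    """Probability of a reward when pulling `action` under latent `theta`."""
    return P_GOOD if action == theta else P_BAD


def latent_conditional_policy(theta):
    """Stand-in for a learned policy pi(a | theta): pull the arm theta names."""
    return theta


def update_belief(belief, action, reward):
    """Exact Bayes update over theta; stand-in for a learned inference model.

    Assumption of this sketch: the agent's own action is treated as an
    intervention, so only the reward likelihood enters the update.
    """
    likelihoods = [
        reward_prob(t, action) if reward else 1.0 - reward_prob(t, action)
        for t in (0, 1)
    ]
    unnormalized = [b * l for b, l in zip(belief, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def run_episode(true_theta, steps=50, seed=0):
    """Alternate between acting under the belief and updating the belief."""
    rng = random.Random(seed)
    belief = [0.5, 0.5]  # uniform prior over the latent
    for _ in range(steps):
        # Act under the belief: sample a latent, then follow the policy for it.
        theta_sample = 0 if rng.random() < belief[0] else 1
        action = latent_conditional_policy(theta_sample)
        # The environment responds according to the true, unobserved latent.
        reward = rng.random() < reward_prob(true_theta, action)
        # Update the belief with the new evidence.
        belief = update_belief(belief, action, reward)
    return belief
```

Calling `run_episode(true_theta=1)` returns the final belief as a two-element list. In this toy problem every step is informative about the latent whichever arm is pulled, so the belief tends to concentrate on the true latent as the episode proceeds.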

Supplementary Material: zip
