Imitation Learning by Reinforcement Learning

Kamil Ciosek

Published: 28 Jan 2022, Last Modified: 22 Jun 2025, Venue: ICLR 2022 Poster, Readers: Everyone

Keywords: reinforcement learning, imitation learning, Markov Decision Process, continuous control

Abstract: Imitation learning algorithms learn a policy from demonstrations of expert behavior. We show that, for deterministic experts, imitation learning can be done by reduction to reinforcement learning with a stationary reward. Our theoretical analysis both certifies the recovery of expert reward and bounds the total variation distance between the expert and the imitation learner, showing a link to adversarial imitation learning. We conduct experiments which confirm that our reduction works well in practice for continuous control tasks.

One-sentence Summary: For deterministic experts, you can do imitation learning by calling an RL solver once, with a stationary reward signal.
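To make the reduction concrete, here is a minimal sketch on a toy tabular problem. It assumes one natural choice of stationary reward, an indicator that pays 1 on state-action pairs seen in the demonstrations and 0 elsewhere. That choice is an assumption for illustration and is not stated in the abstract above. The chain MDP, the helper names `indicator_reward` and `value_iteration`, and the use of value iteration as a stand-in for the RL solver are likewise hypothetical.

```python
import numpy as np

def indicator_reward(expert_pairs, n_states, n_actions):
    """Stationary reward (assumed form): 1 on demonstrated (state, action) pairs, else 0."""
    r = np.zeros((n_states, n_actions))
    for s, a in expert_pairs:
        r[s, a] = 1.0
    return r

def value_iteration(next_state, reward, gamma=0.9, iters=200):
    """Stand-in for the RL solver; next_state[s, a] encodes deterministic transitions."""
    n_states, n_actions = reward.shape
    q = np.zeros((n_states, n_actions))
    for _ in range(iters):
        v = q.max(axis=1)                 # greedy state values
        q = reward + gamma * v[next_state]  # Bellman optimality backup
    return q.argmax(axis=1)               # greedy policy

# Hypothetical 5-state chain: action 0 moves left, action 1 moves right.
n_states, n_actions = 5, 2
next_state = np.array(
    [[max(s - 1, 0), min(s + 1, n_states - 1)] for s in range(n_states)]
)

# Demonstrations from a deterministic expert that always moves right.
expert_pairs = [(s, 1) for s in range(n_states)]

# The reward is fixed before the single solver call and never updated.
reward = indicator_reward(expert_pairs, n_states, n_actions)
policy = value_iteration(next_state, reward)
```

In this toy setting the solver is called once and the reward never changes during training, in contrast to adversarial approaches that retrain a discriminator alongside the policy.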

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/imitation-learning-by-reinforcement-learning/code)
