Sample-Efficient Self-Supervised Imitation Learning

TMLR Paper81 Authors

09 May 2022 (modified: 17 Sept 2024) · Rejected by TMLR · CC BY 4.0
Abstract: Imitation learning allows an agent to acquire skills or mimic behaviors by observing an expert performing a given task. While imitation learning approaches successfully replicate the observed behavior, they are constrained by the trajectories the expert provides, both in quality and in quantity. Reinforcement learning, in contrast, needs no supervised signal to learn a task, but it demands extensive computation and can yield sub-optimal policies under resource constraints. To address these issues, we propose Reinforced Imitation Learning (RIL), a method that uses a very small sample of expert behavior to learn optimal policies while substantially speeding up reinforcement learning. RIL leverages expert trajectories to learn to mimic behavior while also learning from its own experiences in a typical reinforcement learning fashion. A thorough set of experiments shows that our method outperforms both imitation and reinforcement learning methods, striking a good compromise between sample efficiency and task performance.
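As a rough illustration of the idea sketched in the abstract, the snippet below combines a standard temporal-difference loss on the agent's own transitions with a behavior-cloning loss on a small expert batch. This is a minimal sketch under assumed design choices: the network architecture, the loss combination, the weighting term lambda_bc, and all function names (ril_update, QNet) are hypothetical, not the paper's actual formulation.

```python
# Hypothetical sketch of an RIL-style update for a discrete-action agent.
# The specific losses and their weighting are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class QNet(nn.Module):
    """Small Q-network; architecture is an arbitrary placeholder."""
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, obs):
        return self.net(obs)

def ril_update(q_net, target_net, optimizer, rl_batch, expert_batch,
               gamma=0.99, lambda_bc=1.0):
    """One hypothetical update: a TD loss on self-collected experience
    plus a supervised imitation loss on a small expert sample."""
    obs, act, rew, next_obs, done = rl_batch

    # Standard one-step Q-learning target on the agent's own transitions.
    q = q_net(obs).gather(1, act.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = rew + gamma * (1.0 - done) * target_net(next_obs).max(1).values
    rl_loss = F.smooth_l1_loss(q, target)

    # Behavior-cloning term: push the greedy action toward the expert's,
    # treating Q-values as action logits (an assumption of this sketch).
    e_obs, e_act = expert_batch
    bc_loss = F.cross_entropy(q_net(e_obs), e_act)

    loss = rl_loss + lambda_bc * bc_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The design intuition matching the abstract: the imitation term extracts signal from the few available expert trajectories, while the reinforcement term lets the agent improve beyond the expert's demonstrated behavior using its own experience.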
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Olivier_Pietquin1
Submission Number: 81