Improving Sample Efficiency in Model-Free Reinforcement Learning from Images

Denis Yarats; Amy Zhang; Ilya Kostrikov; Brandon Amos; Joelle Pineau; Rob Fergus

Improving Sample Efficiency in Model-Free Reinforcement Learning from Images

Denis Yarats, Amy Zhang, Ilya Kostrikov, Brandon Amos, Joelle Pineau, Rob Fergus

25 Sept 2019 (modified: 22 Jun 2025)ICLR 2020 Conference Blind SubmissionReaders: Everyone

TL;DR: We design a simple and efficient model-free off-policy method for image-based reinforcement learning that matches the state-of-the-art model-based methods in sample efficiency

Abstract: Training an agent to solve control tasks directly from high-dimensional images with model-free reinforcement learning (RL) has proven difficult. The agent needs to learn a latent representation together with a control policy to perform the task. Fitting a high-capacity encoder using a scarce reward signal is not only extremely sample inefficient, but also prone to suboptimal convergence. Two ways to improve sample efficiency are to learn a good feature representation and use off-policy algorithms. We dissect various approaches of learning good latent features, and conclude that the image reconstruction loss is the essential ingredient that enables efficient and stable representation learning in image-based RL. Following these findings, we devise an off-policy actor-critic algorithm with an auxiliary decoder that trains end-to-end and matches state-of-the-art performance across both model-free and model-based algorithms on many challenging control tasks. We release our code to encourage future research on image-based RL.

Code: https://drive.google.com/file/d/1slqgCj3f8br5K6KiHUKHJBnjAhYnw3M-/view

Keywords: reinforcement learning, model-free, off-policy, image-based reinforcement learning, continuous control

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 3 code implementations](https://www.catalyzex.com/paper/improving-sample-efficiency-in-model-free/code)

Original Pdf: pdf

6 Replies

Loading