Continuous Control with Action Quantization from Demonstrations

Published: 28 Jan 2022, Last Modified: 22 Oct 2023
ICLR 2022 Submitted
Readers: Everyone
Keywords: Deep Reinforcement Learning, Action Discretization, Learning from Demonstrations
Abstract: In Reinforcement Learning (RL), discrete actions, as opposed to continuous actions, result in a less complex exploration problem and allow the maximum of the action-value function, which is central to dynamic programming-based methods, to be computed directly. In this paper, we propose a novel method, Action Quantization from Demonstrations (AQuaDem), to learn a discretization of continuous action spaces by leveraging the priors of demonstrations. This dramatically reduces the exploration problem, since the actions faced by the agent are not only finite in number but also plausible in light of the demonstrator's behavior. By discretizing the action space, we can apply any discrete-action deep RL algorithm to the continuous control problem. We evaluate the proposed method on three different setups: RL with demonstrations, RL with play data (demonstrations of a human playing in an environment but not solving any specific task), and Imitation Learning. For all three setups, we only consider human data, which is more challenging than synthetic data. We found that AQuaDem consistently outperforms state-of-the-art continuous control methods, both in terms of performance and sample efficiency.
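The abstract does not detail the discretization procedure itself, so the following is only a minimal sketch of the general idea of turning a continuous action space into a small set of demonstration-derived candidates; it uses k-means clustering of demonstrated actions as an illustrative stand-in, not the AQuaDem model, and the wrapper class and function names are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans


def discretize_actions_from_demos(demo_actions: np.ndarray, num_bins: int = 10) -> np.ndarray:
    """Cluster demonstrated continuous actions into a small set of candidates.

    demo_actions: array of shape (num_transitions, action_dim) collected from demonstrations.
    Returns an array of shape (num_bins, action_dim): one continuous action per discrete index.
    """
    kmeans = KMeans(n_clusters=num_bins, n_init=10).fit(demo_actions)
    return kmeans.cluster_centers_


class DiscretizedActionWrapper:
    """Exposes a discrete action interface over a continuous-control environment."""

    def __init__(self, env, candidate_actions: np.ndarray):
        self.env = env
        self.candidate_actions = candidate_actions  # shape (num_bins, action_dim)

    @property
    def num_actions(self) -> int:
        return len(self.candidate_actions)

    def step(self, discrete_action: int):
        # Map the discrete index back to its continuous action before stepping the env.
        return self.env.step(self.candidate_actions[discrete_action])

    def reset(self):
        return self.env.reset()
```

Any discrete-action deep RL algorithm (e.g., a DQN-style agent) could then act through such a wrapper, selecting an index rather than a raw continuous action.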
Supplementary Material: zip
Community Implementations: 3 code implementations (https://www.catalyzex.com/paper/arxiv:2110.10149/code)
