Neural Discrete Reinforcement Learning

Published: 01 Feb 2023, Last Modified: 13 Feb 2023
Submitted to ICLR 2023
Readers: Everyone
Keywords: Deep Reinforcement Learning, Representation Learning, Action Space
TL;DR: Discretize all types of action spaces in decision making and use arbitrary DRL algorithms to solve them.
Abstract: Designing effective action spaces for complex environments is a fundamental and challenging problem in reinforcement learning (RL). Recent works have shown that naive RL algorithms using well-designed handcrafted discrete action spaces can achieve promising results even on high-dimensional continuous or hybrid decision-making problems. However, designing such action spaces by hand requires comprehensive domain knowledge. In this paper, we systematically analyze the advantages of discretization for different action spaces and then propose a unified framework, Neural Discrete Reinforcement Learning (NDRL), to automatically learn how to effectively discretize almost arbitrary action spaces. Specifically, we propose the Action Discretization Variational AutoEncoder (AD-VAE), an action representation learning method that learns compact latent action spaces while maintaining the essential properties of the original environments, such as boundary actions and the relationships between different action dimensions. Moreover, we identify a key issue: optimizing the AD-VAE and online RL agents in parallel is often unstable. To address this, we further design several techniques to adapt RL agents to the learned action representations, including latent action remapping and ensemble Q-learning. Quantitative experiments and visualization results demonstrate the efficiency and stability of our proposed framework for complex action spaces in various environments.
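For readers unfamiliar with the handcrafted baseline the abstract contrasts against, the following is a minimal sketch of uniform grid discretization of a continuous action space. It is not the AD-VAE or any part of NDRL; the function name `uniform_action_table` and its parameters are hypothetical and chosen only for illustration.

```python
from itertools import product


def uniform_action_table(low, high, bins_per_dim):
    """Enumerate a uniform grid of continuous actions, one tuple per discrete index.

    Hypothetical helper for illustration only; it is not taken from the paper.
    """
    if bins_per_dim < 2:
        raise ValueError("need at least 2 bins per dimension to include both bounds")
    axes = []
    for lo, hi in zip(low, high):
        step = (hi - lo) / (bins_per_dim - 1)
        # Both endpoints are included, so boundary actions stay available.
        axes.append([lo + i * step for i in range(bins_per_dim)])
    # Cartesian product over dimensions: bins_per_dim ** len(low) actions.
    return list(product(*axes))


# Two-dimensional action space with bounds [-1, 1] x [0, 1], three bins each.
table = uniform_action_table(low=[-1.0, 0.0], high=[1.0, 1.0], bins_per_dim=3)
# A discrete-action agent picks an index; the environment receives table[index].
```

The table size grows as `bins_per_dim ** len(low)`, so a naive grid becomes impractical in high-dimensional action spaces; this is the kind of scaling problem that a learned, compact discretization is presumably meant to avoid.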
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Reinforcement Learning (eg, decision and control, planning, hierarchical RL, robotics)
Supplementary Material: zip
