Hyperbolic Deep Reinforcement Learning

Edoardo Cetin; Benjamin Paul Chamberlain; Michael M. Bronstein; Jonathan J Hunt

Hyperbolic Deep Reinforcement Learning

Edoardo Cetin, Benjamin Paul Chamberlain, Michael M. Bronstein, Jonathan J Hunt

Published: 01 Feb 2023, Last Modified: 01 Mar 2023ICLR 2023 notable top 25%Readers: Everyone

Keywords: Reinforcement learning, Hyperbolic space, Representation learning, Machine learning

TL;DR: We use hyperbolic space to model the latent representations of deep RL algorithms, attaining great performance and generalization benefits.

Abstract: In deep reinforcement learning (RL), useful information about the state is inherently tied to its possible future successors. Consequently, encoding features that capture the hierarchical relationships between states into the model's latent representations is often conducive to recovering effective policies. In this work, we study a new class of deep RL algorithms that promote encoding such relationships by using hyperbolic space to model latent representations. However, we find that a naive application of existing methodology from the hyperbolic deep learning literature leads to fatal instabilities due to the non-stationarity and variance characterizing common gradient estimators in RL. Hence, we design a new general method that directly addresses such optimization challenges and enables stable end-to-end learning with deep hyperbolic representations. We empirically validate our framework by applying it to popular on-policy and off-policy RL algorithms on the Procgen and Atari 100K benchmarks, attaining near universal performance and generalization benefits. Given its natural fit, we hope this work will inspire future RL research to consider hyperbolic representations as a standard tool.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Submission Guidelines: Yes

Please Choose The Closest Area That Your Submission Falls Into: Reinforcement Learning (eg, decision and control, planning, hierarchical RL, robotics)

Supplementary Material: zip

14 Replies

Loading