On the Geometry of Reinforcement Learning in Continuous State and Action Spaces

Saket Tiwari; Omer Gottesman; George Konidaris

23 Sept 2023 (modified: 25 Mar 2024), ICLR 2024 Conference Withdrawn Submission

Keywords: geometry, deep reinforcement learning, manifold

TL;DR: We prove that, under certain assumptions for deterministic RL, the effective state set is a low-dimensional manifold, and we show that both DDPG and SAC agents can learn effectively in this low-dimensional space.

Abstract: Advances in reinforcement learning have led to its successful application in complex tasks with continuous state and action spaces. Despite these advances in practice, most theoretical work pertains to finite state and action spaces. We propose building a theoretical understanding of continuous state and action spaces by employing a geometric lens. Central to our work is the idea that the transition dynamics induce a low-dimensional manifold of reachable states embedded in the high-dimensional nominal state space. We prove that, under certain conditions, the dimensionality of this manifold is at most the dimensionality of the action space plus one. This is the first result of its kind, linking the geometry of the state space to the dimensionality of the action space. We empirically corroborate this upper bound for four MuJoCo environments. We further demonstrate the applicability of our result by learning a policy in this low-dimensional representation. To do so, we introduce algorithms that learn a mapping to a low-dimensional representation, as a narrow hidden layer of a deep neural network, in tandem with the policy, using two popular algorithms: Deep Deterministic Policy Gradient and Soft Actor Critic. Our experiments show that such a policy performs on par or better on four MuJoCo control suite tasks.
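
To make the "narrow hidden layer" idea concrete, the following is a minimal NumPy sketch of a deterministic policy network whose bottleneck has width equal to the action dimension plus one, matching the upper bound stated in the abstract. It is an illustration only, not the authors' architecture: the function names `init_policy` and `forward`, the hidden width of 64, the tanh activations, and the linear bottleneck are all assumptions, and the DDPG or SAC training loop is omitted.

```python
import numpy as np

def init_policy(state_dim, action_dim, hidden=64, seed=0):
    """Random weights for state -> hidden -> bottleneck -> hidden -> action.

    Hypothetical layout: the bottleneck width is action_dim + 1, the
    upper bound on the manifold dimension stated in the abstract.
    """
    rng = np.random.default_rng(seed)
    bottleneck = action_dim + 1
    sizes = [state_dim, hidden, bottleneck, hidden, action_dim]
    return [
        (rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
        for m, n in zip(sizes[:-1], sizes[1:])
    ]

def forward(params, states):
    """Return (actions, codes) for a batch of states.

    `codes` is the output of the narrow layer, i.e. the learned
    low-dimensional representation of each state.
    """
    h = states
    code = None
    last = len(params) - 1
    for i, (W, b) in enumerate(params):
        h = h @ W + b
        if i == 1:
            code = h            # linear bottleneck (an assumption)
        elif i < last:
            h = np.tanh(h)      # nonlinearity on the wide hidden layers
    return np.tanh(h), code     # squash actions into [-1, 1]

# Example with made-up dimensions: 17-dimensional states, 6-dimensional actions.
params = init_policy(state_dim=17, action_dim=6)
actions, codes = forward(params, np.ones((5, 17)))
print(actions.shape, codes.shape)
```

In a full agent, the weights before and after the bottleneck would be trained jointly with the policy objective, so that the representation and the policy are learned in tandem as the abstract describes.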

Primary Area: reinforcement learning

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 7838
