Keywords: Single-Timescale Actor-Critic, Continuous State-Action Space, Deep Neural Networks
TL;DR: We establish finite-time convergence of single-timescale neural actor-critic in continuous state-action spaces.
Abstract: Actor-critic (AC) algorithms have been the powerhouse behind many challenging applications, yet a theoretical understanding of finite-time convergence for AC in its most practical form remains elusive. Existing analyses often simplify the algorithm and restrict attention to finite state and action spaces. We analyze the more practical single-timescale AC on continuous state and action spaces, with deep neural network approximations for both the critic and the actor.
Our analysis shows that the iterates of this framework converge to a stationary point at rate $\widetilde{\mathcal{O}}(T^{-1/2})+\widetilde{\mathcal{O}}(m^{-1/2})$, where $T$ is the total number of iterations and $m$ is the width of the deep neural networks. To our knowledge, this is the first finite-time analysis of single-timescale AC in continuous state and action spaces, further narrowing the gap between theory and practice.
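The "single-timescale" setting means the neural actor and critic are each updated once per iteration with step sizes of the same order, rather than running the critic to convergence between actor updates. Below is a minimal sketch of such an update loop, assuming a toy linear-quadratic environment, small network widths, and step sizes chosen purely for illustration; it is not the paper's exact algorithm or experimental setup.

```python
# Minimal single-timescale actor-critic sketch (illustrative assumptions only):
# actor and critic are neural networks updated in the same iteration with
# step sizes of the same order. Environment, widths, and learning rates are
# placeholders, not taken from the paper.
import torch
import torch.nn as nn

torch.manual_seed(0)
state_dim, action_dim, width = 3, 1, 64  # "width" plays the role of m

actor = nn.Sequential(nn.Linear(state_dim, width), nn.ReLU(), nn.Linear(width, action_dim))
critic = nn.Sequential(nn.Linear(state_dim, width), nn.ReLU(), nn.Linear(width, 1))
actor_opt = torch.optim.SGD(actor.parameters(), lr=1e-3)    # same-order step sizes:
critic_opt = torch.optim.SGD(critic.parameters(), lr=1e-3)  # the "single timescale"
gamma, action_std = 0.99, 0.1

def step_env(s, a):
    """Toy linear-quadratic dynamics used only to make the sketch runnable."""
    s_next = 0.9 * s + 0.1 * a.expand_as(s) + 0.01 * torch.randn_like(s)
    reward = -(s_next ** 2).sum() - 0.01 * (a ** 2).sum()
    return s_next.detach(), reward.detach()

s = torch.zeros(state_dim)
for t in range(1, 1001):  # T iterations
    dist = torch.distributions.Normal(actor(s), action_std)
    a = dist.sample()
    s_next, r = step_env(s, a)

    # TD error from the current critic, treated as a constant for both updates.
    with torch.no_grad():
        delta = r + gamma * critic(s_next) - critic(s)

    # Critic: semi-gradient TD(0) step, w <- w + lr * delta * grad V_w(s).
    critic_opt.zero_grad()
    (-delta * critic(s)).sum().backward()
    critic_opt.step()

    # Actor: policy-gradient step with the same TD error, in the same iteration,
    # theta <- theta + lr * delta * grad log pi_theta(a|s).
    actor_opt.zero_grad()
    (-delta * dist.log_prob(a).sum()).backward()
    actor_opt.step()

    s = s_next
```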
Primary Area: learning theory
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6608