Finite Sample Analysis for Single-Loop Single-Timescale Natural Actor-Critic Algorithm

22 Sept 2023 (modified: 25 Mar 2024)
ICLR 2024 Conference Withdrawn Submission
Keywords: reinforcement learning, natural actor-critic, finite-time sample complexity
Abstract: Natural actor-critic (NAC) methods have proven effective in a wide range of reinforcement learning problems. However, a gap remains in the literature regarding the finite-time analysis of the algorithm as it is used in practice. Previous theoretical work on actor-critic methods has focused primarily on the double-loop form, which takes multiple critic steps per actor step, or the two-timescale form, which uses an actor step size much smaller than that of the critic. These variants were designed for ease of analysis and are seldom used in practical applications. In this paper, we study a more practical single-loop single-timescale natural actor-critic algorithm, in which the actor and critic step sizes are proportional and the critic is updated with only a single sample per actor step. Our analysis establishes a sample complexity of $O(1/\epsilon^4)$ for reaching an $\epsilon$-accurate global optimum. To the best of our knowledge, this is the first finite-time convergence result with a global optimality guarantee for the single-loop single-timescale natural actor-critic algorithm with linear function approximation.
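The single-loop, single-timescale structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm or code: the toy MDP, the one-hot features, the function name `run_nac`, and the parameters `beta` (critic step size) and `ratio` (actor-to-critic step-size ratio) are all assumptions made for illustration. The sketch relies on the known fact that, for a tabular softmax policy, the natural policy gradient direction is proportional to the advantage; as a simplification, it applies that update only at the visited state.

```python
import numpy as np

def run_nac(num_steps=5000, beta=0.05, ratio=0.5, gamma=0.9, seed=0):
    """Toy single-loop, single-timescale NAC sketch (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    n_states, n_actions = 4, 2
    # Hypothetical random MDP: P[s, a] is a distribution over next states.
    P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
    R = rng.uniform(size=(n_states, n_actions))  # rewards in [0, 1)

    def phi(s, a):
        # One-hot features: the simplest case of linear function approximation.
        f = np.zeros(n_states * n_actions)
        f[s * n_actions + a] = 1.0
        return f

    def policy(theta, s):
        # Softmax over the logits of state s (shifted for numerical stability).
        p = np.exp(theta[s] - theta[s].max())
        return p / p.sum()

    theta = np.zeros((n_states, n_actions))  # actor parameters (softmax logits)
    w = np.zeros(n_states * n_actions)       # critic weights, Q(s, a) ~ phi(s, a) @ w
    alpha = ratio * beta                     # single timescale: actor step proportional to critic step

    s = 0
    a = rng.choice(n_actions, p=policy(theta, s))
    for _ in range(num_steps):
        # Draw exactly one transition per iteration (single loop, single sample).
        s_next = rng.choice(n_states, p=P[s, a])
        a_next = rng.choice(n_actions, p=policy(theta, s_next))

        # Critic: one TD(0) step on the linear Q estimate.
        td_error = R[s, a] + gamma * phi(s_next, a_next) @ w - phi(s, a) @ w
        w = w + beta * td_error * phi(s, a)

        # Actor: one natural-gradient-style step using the critic's advantage
        # estimate at the visited state (a simplification, see lead-in).
        q_s = w.reshape(n_states, n_actions)[s]
        advantage = q_s - policy(theta, s) @ q_s
        theta[s] += alpha * advantage

        s, a = s_next, a_next
    return theta, w

if __name__ == "__main__":
    theta, w = run_nac()
    print("learned logits:\n", theta)
```

The point of the sketch is the loop body: each iteration performs one critic update and one actor update from the same sample, with `alpha` a fixed multiple of `beta`. A double-loop variant would instead run an inner loop of critic updates per actor step, and a two-timescale variant would shrink `alpha` relative to `beta` over time.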
Supplementary Material: pdf
Primary Area: reinforcement learning
Submission Number: 4985