Nearly Optimal Policy Optimization with Stable at Any Time Guarantee

Tianhao Wu, Yunchang Yang, Han Zhong, Liwei Wang, Simon S. Du, Jiantao Jiao

Published: 2022, Last Modified: 12 May 2023, ICML 2022, Readers: Everyone

Abstract: Policy optimization methods are one of the most widely used classes of Reinforcement Learning (RL) algorithms. However, theoretical understanding of these methods remains insufficient. Even in the ...
