Fast Global Convergence of Natural Policy Gradient Methods with Entropy Regularization

2022 (modified: 17 Nov 2022), Oper. Res. 2022
Abstract: Preconditioning and Regularization Enable Faster Reinforcement Learning. Natural policy gradient (NPG) methods, in conjunction with entropy regularization to encourage exploration, are among the most...