Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation

2021 (modified: 18 Apr 2023), COLT 2021, Readers: Everyone
Abstract: Policy optimization methods are popular reinforcement learning (RL) algorithms because their incremental and on-policy nature makes them more stable than their value-based counterparts. However, the...