Trajectory-wise Control Variates for Variance Reduction in Policy Gradient Methods

2019 (modified: 18 Apr 2023), CoRL 2019, Readers: Everyone
Abstract: Policy gradient methods have demonstrated success in reinforcement learning tasks with high-dimensional continuous state and action spaces. But they are also notoriously sample inefficient, which c...