Trajectory-wise Control Variates for Variance Reduction in Policy Gradient Methods

Ching-An Cheng, Xinyan Yan, Byron Boots

2019 (modified: 18 Apr 2023), CoRL 2019, Readers: Everyone

Abstract: Policy gradient methods have demonstrated success in reinforcement learning tasks with high-dimensional continuous state and action spaces. But they are also notoriously sample inefficient, which c...
