Beyond Variance Reduction: Understanding the True Impact of Baselines on Policy Optimization

2021 (modified: 26 Sept 2022), ICML 2021, Readers: Everyone
Abstract: Bandit and reinforcement learning (RL) problems can often be framed as optimization problems where the goal is to maximize average performance while having access only to stochastic estimates of th...