Beyond Variance Reduction: Understanding the True Impact of Baselines on Policy Optimization
Wesley Chung, Valentin Thomas, Marlos C. Machado, Nicolas Le Roux
2021 (modified: 26 Sept 2022)
ICML 2021
Readers: Everyone
Abstract:
Bandit and reinforcement learning (RL) problems can often be framed as optimization problems where the goal is to maximize average performance while having access only to stochastic estimates of th...