Variance Reduction for Reinforcement Learning in Input-Driven Environments

Hongzi Mao, Shaileshh Bojja Venkatakrishnan, Malte Schwarzkopf, Mohammad Alizadeh

2019 (modified: 01 May 2023) | ICLR (Poster) 2019 | Readers: Everyone

Abstract: For environments driven in part by external input processes, we derive an input-dependent baseline that provably reduces the variance of policy gradient methods and improves policy performance across a wide range of RL tasks.
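
To make the idea of an input-dependent baseline concrete, here is a minimal toy sketch. It is not the paper's method or experimental setup. Everything in it is an assumption made for illustration: a one-step environment, a Bernoulli policy with logit `theta`, a Gaussian external input `z` that is independent of the action, a reward of `a + z`, and the hypothetical function name `gradient_samples`. It compares the score-function gradient estimator under a constant baseline, E[r], with the same estimator under a baseline conditioned on the input, E[r | z].

```python
import math
import random
import statistics


def gradient_samples(theta, n, sigma, seed=0):
    """Sample policy-gradient estimates in a toy one-step input-driven task.

    Assumed setup (illustrative only): action a ~ Bernoulli(p) with
    p = sigmoid(theta), external input z ~ N(0, sigma^2) drawn independently
    of the action, and reward r = a + z.
    """
    rng = random.Random(seed)
    p = 1.0 / (1.0 + math.exp(-theta))
    const_baseline, input_baseline = [], []
    for _ in range(n):
        z = rng.gauss(0.0, sigma)             # external input, unaffected by the action
        a = 1.0 if rng.random() < p else 0.0  # sampled action
        r = a + z                             # reward depends on action and input
        score = a - p                         # d/dtheta log pi(a) for a sigmoid-Bernoulli policy
        # Constant baseline: E[r] = p, because E[z] = 0.
        const_baseline.append((r - p) * score)
        # Input-dependent baseline: E[r | z] = p + z.
        input_baseline.append((r - (p + z)) * score)
    return const_baseline, input_baseline


g_const, g_input = gradient_samples(theta=0.5, n=50_000, sigma=2.0)
print("constant baseline:        mean=%.4f var=%.4f"
      % (statistics.mean(g_const), statistics.variance(g_const)))
print("input-dependent baseline: mean=%.4f var=%.4f"
      % (statistics.mean(g_input), statistics.variance(g_input)))
```

In this toy setting both estimators target the same gradient, p(1 - p), because the input is independent of the action and so subtracting a function of `z` adds no bias. The input-dependent baseline removes the part of the variance contributed by `z`, so its sample variance should come out much lower.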
