The Small Batch Size Anomaly in Multistep Deep Reinforcement Learning

01 Mar 2023 (modified: 11 Apr 2023) · Submitted to Tiny Papers @ ICLR 2023
Keywords: Reinforcement Learning, Deep Reinforcement Learning, Value-based, Batch Size, Multi-step Learning
TL;DR: We perform an exhaustive investigation into the interplay of batch size and update horizon and uncover a surprising phenomenon: when increasing the update horizon, it is more beneficial to decrease the batch size.
Abstract: We present a surprising discovery: in deep reinforcement learning, decreasing the batch size during training can dramatically improve the agent's performance when combined with multi-step learning. Both reducing batch sizes and increasing the update horizon increase the variance of the gradients, so it is quite surprising that increased variance on two fronts yields improved performance. We perform a wide range of experiments to gain a better understanding of this phenomenon, which we denote variance double-down.
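For context, the "update horizon" refers to the n in n-step (multi-step) bootstrapped targets. As a point of standard background (not reproduced from the paper itself), the n-step target used in value-based methods such as Q-learning is

$$
G_t^{(n)} = \sum_{k=0}^{n-1} \gamma^{k} R_{t+k+1} + \gamma^{n} \max_{a} Q\left(S_{t+n}, a\right).
$$

Larger n propagates reward information faster but compounds sampling noise across n transitions, which is why both increasing the update horizon and reducing the batch size raise the variance of the gradient estimates.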