In Support of Over-Parametrization in Deep Reinforcement Learning: an Empirical Study

Brady Neal; Ioannis Mitliagkas

Published: 04 Jun 2019, Last Modified: 05 May 2023

Venue: ICML Deep Phenomena 2019

Readers: Everyone

Keywords: overparametrization, over-parameterization, reinforcement learning, deep reinforcement learning, generalization

TL;DR: Over-parametrization in width seems to help in deep reinforcement learning, just as it does in supervised learning.

Abstract: There is significant recent evidence in supervised learning that, in the over-parametrized setting, wider networks achieve better test error. In other words, the bias-variance tradeoff is not directly observable when increasing network width arbitrarily. We investigate whether a corresponding phenomenon is present in reinforcement learning. We experiment on four OpenAI Gym environments, increasing the width of the value and policy networks beyond their prescribed values. Our empirical results lend support to this hypothesis. However, tuning the hyperparameters of each network width separately remains important future work: in environments and algorithms where the optimal hyperparameters vary noticeably across widths, using the same hyperparameters for all widths confounds the results.
