2018 (modified: 08 Nov 2022), ICML 2018, Readers: Everyone
Abstract: In value-based reinforcement learning methods such as deep Q-learning, function approximation errors are known to lead to overestimated value estimates and suboptimal policies. We show that this pr...
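
The overestimation effect mentioned in the abstract can be illustrated with a small simulation. This is a minimal sketch of the general phenomenon, not the paper's method: it assumes every action has a true value of zero and that approximation error is zero-mean Gaussian noise. The function name `mean_max_of_noisy_estimates` and all parameter values are hypothetical, chosen only for illustration.

```python
import random


def mean_max_of_noisy_estimates(num_actions, noise_std, num_trials, seed=0):
    """Average of max_a Q_hat(a) when every true Q(a) is 0.

    Q_hat(a) = 0 + zero-mean Gaussian noise (an assumed error model).
    Any positive result is pure overestimation caused by the max operator.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    for _ in range(num_trials):
        estimates = [rng.gauss(0.0, noise_std) for _ in range(num_actions)]
        total += max(estimates)  # greedy target, as in Q-learning
    return total / num_trials


# With one action the max is just the noisy estimate, so the average stays near 0.
bias_single = mean_max_of_noisy_estimates(num_actions=1, noise_std=1.0, num_trials=20000)
# With several actions the max systematically picks up positive noise.
bias_many = mean_max_of_noisy_estimates(num_actions=10, noise_std=1.0, num_trials=20000)

print(f"1 action:   average max estimate = {bias_single:.3f}")
print(f"10 actions: average max estimate = {bias_many:.3f}")
```

Under these assumptions each individual estimate is unbiased, yet the maximum over several estimates is biased upward, which is the kind of overestimation that can then propagate through bootstrapped value targets.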