A Kernel Loss for Solving the Bellman Equation

Yihao Feng, Lihong Li, Qiang Liu

06 Sept 2019 (modified: 05 May 2023), NeurIPS 2019
Abstract: Value function learning plays a central role in many state-of-the-art reinforcement learning algorithms. However, many standard algorithms like Q-learning lose their convergence guarantees when combined with function approximation, and divergence is indeed observed in practice. In this paper, we propose a novel loss function whose minimization yields the true value function. The key advantage of this new loss is that its gradient can be easily approximated using sampled transitions, avoiding the double-sample issue faced by prior algorithms like residual gradient. In practice, our approach can be combined with general (differentiable) function classes such as neural networks, and is shown to work reliably and effectively on several benchmarks.
Code Link: https://github.com/lewisKit/Kernel-Bellman-Loss
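To make the abstract's claim concrete, below is a minimal sketch of how a kernel-weighted Bellman loss and its gradient could be estimated from sampled transitions only. It is not the authors' released implementation (see the repository linked above for that); the linear value function, Gaussian kernel, bandwidth, and random data are illustrative assumptions.

```python
# Minimal sketch of a kernel Bellman loss estimated from sampled transitions.
# Assumptions (not from the paper's code): linear V(s) = w^T phi(s), RBF kernel
# over state features, and a single batch of transitions (phi_s, r, phi_next).
import numpy as np

def rbf_kernel(X, Y, bandwidth=1.0):
    """Gaussian kernel matrix K[i, j] = exp(-||x_i - y_j||^2 / (2 * bandwidth^2))."""
    sq_dists = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def kernel_bellman_loss_and_grad(w, phi_s, phi_next, rewards, gamma=0.99, bandwidth=1.0):
    """U-statistic-style estimate of a kernel Bellman loss and its gradient in w.

    Bellman residual: R_i = r_i + gamma * V(s'_i) - V(s_i)
    Loss estimate:    L(w) ~ (1 / (n * (n - 1))) * sum_{i != j} R_i K_ij R_j
    Pairing distinct transitions (i != j) is what sidesteps the double-sample issue:
    each residual in a product comes from an independently sampled transition.
    """
    n = len(rewards)
    residuals = rewards + gamma * phi_next @ w - phi_s @ w   # shape (n,)
    grad_residuals = gamma * phi_next - phi_s                # dR_i/dw, shape (n, d)
    K = rbf_kernel(phi_s, phi_s, bandwidth)
    np.fill_diagonal(K, 0.0)                                 # drop i == j terms
    loss = residuals @ K @ residuals / (n * (n - 1))
    grad = 2.0 * grad_residuals.T @ (K @ residuals) / (n * (n - 1))
    return loss, grad

# Usage with random placeholder data (not the paper's benchmarks); the gradient
# can be passed to any gradient-based optimizer over the parameters w.
rng = np.random.default_rng(0)
n, d = 64, 8
phi_s, phi_next = rng.normal(size=(n, d)), rng.normal(size=(n, d))
rewards = rng.normal(size=n)
w = np.zeros(d)
loss, grad = kernel_bellman_loss_and_grad(w, phi_s, phi_next, rewards)
print("kernel Bellman loss:", loss, "gradient norm:", np.linalg.norm(grad))
```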