Q-Learning as a monotone scheme

Lingyi Yang

Published: 19 Mar 2024, Last Modified: 30 May 2024, Venue: Tiny Papers @ ICLR 2024 Archive, License: CC BY 4.0

Keywords: reinforcement learning, Q-learning, monotone, numerical methods

Abstract: Stability issues persist in reinforcement learning methods. To better understand the stability and convergence issues that arise in deep reinforcement learning, we examine a simple linear-quadratic example. We interpret the convergence criterion of exact Q-learning in the sense of a monotone scheme and discuss the consequences of function approximation for monotonicity properties.

Submission Number: 101
