Technical Note: Q-Learning

Christopher J. C. H. Watkins, Peter Dayan

Published: 1992, Last Modified: 18 Feb 2025, Mach. Learn. 1992, CC BY-SA 4.0

Abstract: Q-learning (Watkins, 1989) is a simple way for agents to learn how to act optimally in controlled Markovian domains. It amounts to an incremental method for dynamic programming which imposes limited computational demands. It works by successively improving its evaluations of the quality of particular actions at particular states.
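To make "successively improving its evaluations of the quality of particular actions at particular states" concrete, here is a minimal sketch of the standard one-step tabular Q-learning update, Q(s, a) <- (1 - alpha) Q(s, a) + alpha (r + gamma max_b Q(s', b)). The three-state chain environment, the function names (`q_update`, `step`, `train`), and the values of `alpha` and `gamma` are illustrative assumptions, not taken from the abstract.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one Q-learning update in place and return the new estimate."""
    best_next = max(q[next_state])  # value of the best action in the next state
    target = reward + gamma * best_next
    q[state][action] = (1 - alpha) * q[state][action] + alpha * target
    return q[state][action]


def step(state, action):
    """Hypothetical chain 0 -> 1 -> 2: action 1 moves right, action 0 stays.

    Entering the terminal state 2 pays reward 1; every other move pays 0.
    """
    next_state = state + 1 if action == 1 else state
    reward = 1.0 if next_state == 2 else 0.0
    return next_state, reward


def train(episodes=500, seed=0):
    """Learn Q-values for the toy chain using purely random exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(3)]  # q[state][action]; state 2 is terminal
    for _ in range(episodes):
        state = 0
        while state != 2:
            action = rng.randrange(2)
            next_state, reward = step(state, action)
            q_update(q, state, action, reward, next_state)
            state = next_state
    return q


if __name__ == "__main__":
    q = train()
    for s in range(2):
        print(f"state {s}: Q(stay)={q[s][0]:.3f}, Q(right)={q[s][1]:.3f}")
```

Each update touches a single table entry and needs only the observed reward and next state, which is why the method is incremental and computationally light. In this toy chain the estimates for moving right approach 0.9 in state 0 and 1.0 in state 1, so the greedy policy moves right in both states.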