The Convergence of TD(lambda) for General lambda

Published: 01 Jan 1992, Last Modified: 18 Feb 2025
Venue: Mach. Learn. 1992
License: CC BY-SA 4.0
Abstract: The method of temporal differences (TD) is one way of making consistent predictions about the future. This paper uses some analysis of Watkins (1989) to extend a convergence theorem due to Sutton (1988) from the case which only uses information from adjacent time steps to that involving information from arbitrary ones.
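As a rough illustration of the distinction the abstract draws, the sketch below implements a common tabular, online form of TD(lambda) with accumulating eligibility traces. It is not taken from the paper, and the paper's analysis may use a different update schedule. The function name `td_lambda_episode`, its parameters, and the three-state example are all hypothetical. With `lam=0` only the state visited at the current step is updated (the adjacent-time-step case); with `lam>0` earlier states also receive a share of each prediction error.

```python
def td_lambda_episode(values, states, rewards, alpha=0.1, gamma=1.0, lam=0.5):
    """Run one episode of tabular TD(lambda) and return the updated values.

    Assumptions (illustrative, not from the paper):
      - states is s_0..s_T, where s_T is terminal and has value 0
      - rewards is r_1..r_T, so len(rewards) == len(states) - 1
      - accumulating traces, updates applied online at every step
    """
    values = list(values)            # copy so the caller's list is untouched
    traces = [0.0] * len(values)     # one eligibility trace per state
    last_step = len(rewards) - 1

    for t, reward in enumerate(rewards):
        s = states[t]
        # The terminal state's value is taken to be zero.
        v_next = 0.0 if t == last_step else values[states[t + 1]]
        delta = reward + gamma * v_next - values[s]   # TD error at step t

        traces[s] += 1.0             # mark the current state as eligible
        for i in range(len(values)):
            values[i] += alpha * delta * traces[i]
            traces[i] *= gamma * lam  # decay every trace

    return values


# Three non-terminal states visited in order, reward 1 on the final step.
episode_states = [0, 1, 2, 3]        # index 3 stands for the terminal state
episode_rewards = [0.0, 0.0, 1.0]
v_td0 = td_lambda_episode([0.0] * 3, episode_states, episode_rewards,
                          alpha=0.5, lam=0.0)
v_half = td_lambda_episode([0.0] * 3, episode_states, episode_rewards,
                           alpha=0.5, lam=0.5)
v_one = td_lambda_episode([0.0] * 3, episode_states, episode_rewards,
                          alpha=0.5, lam=1.0)
```

In this toy episode, `lam=0.0` changes only the last visited state, while larger values of `lam` spread the final reward back over the earlier states in one pass.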