A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation.

Jalaj Bhandari, Daniel Russo, Raghav Singal

2018 (modified: 09 Nov 2022) · COLT 2018 · Readers: Everyone

Abstract: Temporal difference learning (TD) is a simple iterative algorithm used to estimate the value function corresponding to a given policy in a Markov decision process. Although TD is one of the most wi...
