Reducing Sampling Error in Batch Temporal Difference Learning

Brahma S. Pavse, Ishan Durugkar, Josiah Hanna, Peter Stone

2020 (modified: 29 Sept 2023), ICML 2020, Readers: Everyone

Abstract: Temporal difference (TD) learning is one of the main foundations of modern reinforcement learning. This paper studies the use of TD(0), a canonical TD algorithm, to estimate the value function of a...
