2020 (modified: 06 May 2026), ICML 2020, Readers: Everyone
Abstract: It is still common to use Q-learning and temporal difference (TD) learning, even though they have divergence issues and sound Gradient TD alternatives exist, because divergence seems rare and the...