Separable value functions across time-scalesDownload PDFOpen Website

2019 (modified: 11 Nov 2022)ICML 2019Readers: Everyone
Abstract: In many finite horizon episodic reinforcement learning (RL) settings, it is desirable to optimize for the undiscounted return - in settings like Atari, for instance, the goal is to collect the most...
0 Replies

Loading