Separable value functions across time-scales

Joshua Romoff, Peter Henderson, Ahmed Touati, Yann Ollivier, Joelle Pineau, Emma Brunskill

2019 (modified: 11 Nov 2022), ICML 2019, Readers: Everyone

Abstract: In many finite-horizon episodic reinforcement learning (RL) settings, it is desirable to optimize for the undiscounted return; in settings like Atari, for instance, the goal is to collect the most...
