2020 (modified: 31 Mar 2022) ICML 2020 Readers: Everyone
Abstract: In this paper, we introduce a novel form of value function, $Q(s, s')$, that expresses the utility of transitioning from a state $s$ to a neighboring state $s'$ and then acting optimally thereafter...