2021 (modified: 19 May 2022) | COLT 2021 | Readers: Everyone
Abstract: We consider the problem of local planning in fixed-horizon Markov Decision Processes (MDPs) with a generative model, under the assumption that the optimal value function lies close to the span of a ...