Sample-Optimal Parametric Q-Learning Using Linearly Additive Features

2019 (modified: 11 Nov 2022), ICML 2019. Readers: Everyone
Abstract: Consider a Markov decision process (MDP) that admits a set of state-action features, which can linearly express the process’s probabilistic transition model. We propose a parametric Q-learning algo...