Optimizing over a Restricted Policy Class in MDPs

Ershad Banijamali, Yasin Abbasi-Yadkori, Mohammad Ghavamzadeh, Nikos Vlassis

2019 (modified: 24 Sept 2022), AISTATS 2019, Readers: Everyone

Abstract: We address the problem of finding an optimal policy in a Markov decision process (MDP) under a restricted policy class defined by the convex hull of a set of base policies. This problem is of great...
