Adaptive Sampling for Best Policy Identification in Markov Decision Processes

2021 (modified: 25 Apr 2023), ICML 2021, Readers: Everyone
Abstract: We investigate the problem of best-policy identification in discounted Markov Decision Processes (MDPs) when the learner has access to a generative model. The objective is to devise a learning algo...