OpenReview.net
Adaptive Sampling for Best Policy Identification in Markov Decision Processes
Aymen Al Marjani, Alexandre Proutière
2021 (modified: 25 Apr 2023)
ICML 2021
Readers: Everyone
Abstract:
We investigate the problem of best-policy identification in discounted Markov Decision Processes (MDPs) when the learner has access to a generative model. The objective is to devise a learning algo...