Thompson Sampling for Learning Parameterized Markov Decision Processes

2015 (modified: 08 Nov 2022), COLT 2015
Abstract: We consider reinforcement learning in parameterized Markov Decision Processes (MDPs), where the parameterization may induce correlation across transition probabilities or rewards. Consequently, obs...