Online Markov Decision Processes

2009 (modified: 05 Nov 2022) | Math. Oper. Res. 2009 | Readers: Everyone
Abstract: We consider a Markov decision process (MDP) setting in which the reward function is allowed to change after each time step (possibly in an adversarial manner), yet the dynamics remain fixed. Simila...