2022 (modified: 30 Mar 2022) ALT 2022 Readers: Everyone
Abstract: We develop a model selection approach to tackle reinforcement learning with adversarial corruption in both transition and reward. For finite-horizon tabular MDPs, without prior knowledge on the tot...