Keywords: uncertainty set, robust MDP
Abstract: In robust Markov Decision Processes (MDPs), the uncertainty set is often assumed to be fixed and given. However, the size of the uncertainty set is crucial because of the inherent trade-off between robustness and conservativeness: a larger uncertainty set yields a more robust solution but tends toward greater conservativeness, while a smaller set may sacrifice robustness for higher performance. In this work, we introduce a novel method to learn the size of the reward uncertainty set from data. Such a data-driven approach ensures that the learned uncertainty set is large enough to cover the underlying models implied by the data while remaining compact enough to minimize conservativeness.
Submission Number: 71