On Evaluating Policies for Robust POMDPs

Published: 17 Jul 2025, Last Modified: 06 Sept 2025 · EWRL 2025 Poster · CC BY 4.0
Keywords: Robust POMDPs, Policy evaluation, Planning under uncertainty
TL;DR: To advance evaluation of RPOMDP policies, we (1) introduce a formalization for suitable benchmarks, (2) define a sound evaluation method, and (3) lift existing POMDP value bounds to RPOMDPs.
Abstract: Robust partially observable Markov decision processes (RPOMDPs) model partially observable sequential decision-making problems where an agent must be $\textit{robust}$ against a range of dynamics. RPOMDPs can be viewed as two-player games between an agent, which selects actions, and $\textit{nature}$, which adversarially selects dynamics. Evaluating an agent policy requires finding an adversarial nature policy, which is computationally challenging. In this paper, we advance the evaluation of agent policies for RPOMDPs in three ways. First, we discuss suitable benchmarks. We observe that for some RPOMDPs, an optimal agent policy can be found by considering only subsets of nature policies, making them easier to solve. We formalize this concept of $\textit{solvability}$ and construct three benchmarks that are only solvable for expressive sets of nature policies. Second, we describe a provably sound method to evaluate agent policies for RPOMDPs by solving an equivalent MDP. Third, we lift two well-known POMDP upper value bounds to RPOMDPs, which can be used to efficiently approximate the optimality gap of a policy and serve as baselines. Our experimental evaluation shows that (1) our proposed benchmarks cannot be solved by assuming naive nature policies, (2) our method of evaluating policies is accurate, and (3) the approximations provide solid baselines for evaluation.
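For context, the two-player view described in the abstract is often written as a max–min objective; the sketch below uses assumed notation ($\pi$ for agent policies, $\theta$ for nature policies, i.e., choices of dynamics from the uncertainty set, $\gamma$ for the discount factor, $r_t$ for rewards) that is illustrative and not taken verbatim from the paper:

$$V^{*} \;=\; \max_{\pi \in \Pi} \; \min_{\theta \in \Theta} \; \mathbb{E}_{\pi,\theta}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r_t\right].$$

Under this reading, evaluating a fixed agent policy $\hat{\pi}$ amounts to computing the inner minimization $\min_{\theta \in \Theta} \mathbb{E}_{\hat{\pi},\theta}\!\left[\sum_{t} \gamma^{t} r_t\right]$, i.e., finding an adversarial nature policy, which is the computational challenge the paper addresses.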
Confirmation: I understand that authors of each paper submitted to EWRL may be asked to review 2-3 other submissions to EWRL.
Serve As Reviewer: ~Maris_F._L._Galesloot1
Track: Regular Track: unpublished work
Submission Number: 33