Abstract: We consider multi-dimensional cost-bounded reachability probability objectives for partially observable Markov decision processes (POMDPs). The goal is to compute the maximal probability to reach a set of target states while simultaneously satisfying specified bounds on incurred costs. Such objectives generalise well-studied POMDP objectives by allowing multiple upper and lower bounds on different cost or reward measures, e.g. to naturally model scenarios where an agent acts under limited resources. We present a reduction of the multi-cost-bounded problem to unbounded reachability probabilities on an unfolding of the original POMDP. We employ a refined approach in case the agent is cost-aware (i.e., collected costs are fully observed) and also consider a setting where only partial information about the collected costs is known. Our approaches elegantly lift existing results from the fully observable MDP case to POMDPs. An empirical evaluation shows the potential of analysing POMDPs under multi-cost-bounded reachability objectives in practical settings.
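The core reduction described in the abstract (cost-bounded reachability via an unfolding into budget-annotated states) can be illustrated on the simpler, fully observable MDP case. The following is a minimal hedged sketch, not the paper's implementation: the model encoding, state names, and the single upper cost bound are illustrative assumptions, and all action costs are assumed to be at least 1 so the unfolding is finite.

```python
from functools import lru_cache

def max_reach_prob(transitions, target, init, budget):
    """Maximal probability to reach `target` from `init` with total cost <= budget.

    transitions[s][a] = list of (prob, next_state, cost) triples; costs >= 1.
    The pair (state, remaining budget) plays the role of a state in the
    unfolded model, on which plain (unbounded) reachability is computed.
    """
    @lru_cache(maxsize=None)
    def val(s, b):
        if b < 0:
            return 0.0          # cost bound already violated
        if s in target:
            return 1.0          # target reached within budget
        best = 0.0
        for succ in transitions.get(s, {}).values():
            # Expected value of taking this action in the unfolding:
            # the remaining budget shrinks by the incurred cost.
            best = max(best, sum(p * val(s2, b - c) for p, s2, c in succ))
        return best

    return val(init, budget)

# Toy MDP: from s0, action 'a' reaches the target t directly with
# probability 0.5 (cost 1) or detours via s1; action 'b' moves to s1 at cost 2.
mdp = {
    's0': {'a': [(0.5, 't', 1), (0.5, 's1', 1)], 'b': [(1.0, 's1', 2)]},
    's1': {'a': [(1.0, 't', 2)]},
}
print(max_reach_prob(mdp, {'t'}, 's0', 3))  # budget 3: both routes fit
print(max_reach_prob(mdp, {'t'}, 's0', 2))  # budget 2: only the direct hit
```

In the POMDP setting of the paper, the same idea applies to beliefs rather than states, and the cost-aware refinement exploits that the remaining budget is itself fully observable.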
External IDs: dblp:conf/uai/BorkKQS25