Missingness-MDPs: Bridging the Theory of Missing Data and POMDPs

15 Sept 2025 (modified: 08 Jan 2026) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Decision and Control, Missing Data, Model-Based RL, Planning, Reinforcement Learning
TL;DR: We introduce a framework that integrates the theory of missing data into POMDP planning and propose algorithms for learning observation functions under different missingness processes.
Abstract: We introduce missingness-MDPs (miss-MDPs), a subclass of partially observable Markov decision processes (POMDPs) that incorporates the theory of missing data. Miss-MDPs capture settings where, at each step, features of the current state may go missing, that is, the state is not fully observed. Missingness of state features occurs dynamically, governed by the missingness function, a restricted observation function. In miss-MDPs, we distinguish three types of missingness functions: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). Our problem is to compute a policy for a miss-MDP with an unknown missingness function from a dataset of observations. We propose probably approximately correct (PAC) algorithms that approximate the missingness function from such a dataset and, thereby, the true miss-MDP. We show that, for specific missingness functions, the policy computed on the approximated model is epsilon-optimal in the true miss-MDP. The empirical evaluation confirms these findings and shows that our approach becomes more sample-efficient when it exploits the type of missingness function.
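To give a concrete sense of the simplest case described in the abstract, the sketch below illustrates what PAC-style estimation of a missingness function could look like under an MCAR assumption, where each state feature goes missing independently with a fixed probability. This is not the paper's algorithm; the function names, the per-feature independence assumption, and the Hoeffding-based sample-size bound are all illustrative choices made here.

```python
import numpy as np

def estimate_mcar_rates(masks: np.ndarray) -> np.ndarray:
    """Estimate per-feature missingness probabilities under an MCAR assumption.

    `masks` is an (n_samples, n_features) boolean array in which True marks
    a feature that was missing in the corresponding observation.
    """
    return masks.mean(axis=0)

def mcar_sample_size(epsilon: float, delta: float) -> int:
    """Hoeffding-style bound: number of observations so that a single
    estimated rate lies within `epsilon` of the true rate with probability
    at least 1 - delta. (Illustrative bound, not the paper's analysis.)"""
    return int(np.ceil(np.log(2.0 / delta) / (2.0 * epsilon ** 2)))

# Hypothetical example: 1000 observations of 3 features whose true MCAR
# missingness rates are 0.1, 0.3, and 0.5.
rng = np.random.default_rng(0)
true_rates = np.array([0.1, 0.3, 0.5])
masks = rng.random((1000, 3)) < true_rates
print(estimate_mcar_rates(masks))    # roughly [0.1, 0.3, 0.5]
print(mcar_sample_size(0.05, 0.05))  # about 738 samples per feature
```

Under MAR or MNAR, the missingness probability would additionally depend on observed or unobserved state features, so a per-feature empirical mean as above would no longer suffice; the abstract indicates that exploiting which of the three types holds is what drives the sample-efficiency gains.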
Primary Area: reinforcement learning
Submission Number: 5941