Off-policy Predictive Control with Causal Sensitivity Analysis

Published: 07 May 2025, Last Modified: 13 Jun 2025, UAI 2025 Poster, CC BY 4.0
Keywords: off-policy learning, causal inference, hidden confounding, sensitivity analysis, reinforcement learning
TL;DR: We show how to do model predictive control with hidden confounders.
Abstract: Predictive models are often deployed for decision-making tasks for which they were not explicitly trained. When only partial observations of the relevant state are available, as in most real-world applications, there is a strong possibility of hidden confounding. Partial observability therefore often makes the outcome of an action unidentifiable and can render a model's predictions unreliable for action planning. We present an identification bound and propose an algorithm to account for hidden confounding during model-predictive control. To that end, we introduce a generalized causal sensitivity model for action-state dynamics. We place a constraint on the hidden confounding between trajectories of future actions and states, enabling sharp bounds on interventional outcomes. Unlike previous sensitivity models, ours accommodates hidden confounding with memory while maintaining computational and statistical tractability. We benchmark our approach on a wide variety of multivariate stochastic differential equations with arbitrary confounding. The results suggest that a calibrated sensitivity model helps controllers achieve higher rewards.
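To make the planning idea concrete, below is a minimal Python sketch of pessimistic model-predictive control under an interval-valued identification bound. Everything here is illustrative rather than the paper's actual algorithm: the sensitivity parameter `gamma`, the log-gamma bound width, and the random-shooting planner are assumptions standing in for the generalized causal sensitivity model described in the abstract.

```python
import numpy as np

def predict_interval(model, state, action, gamma):
    """Bound the one-step interventional outcome under hidden confounding.

    Placeholder for an identification bound: the nominal prediction is
    widened into [lo, hi] as a function of the sensitivity parameter
    gamma >= 1; gamma = 1 recovers the unconfounded point prediction.
    """
    mean, std = model(state, action)      # nominal predictive model
    slack = np.log(gamma) * std           # illustrative bound width
    return mean - slack, mean + slack

def robust_mpc(model, reward_fn, state, action_seqs, gamma, horizon):
    """Random-shooting MPC that ranks action sequences by worst-case return."""
    best_seq, best_worst_case = None, -np.inf
    for seq in action_seqs:               # candidate action sequences
        s, worst_return = state, 0.0
        for a in seq[:horizon]:
            lo, hi = predict_interval(model, s, a, gamma)
            # Pessimism over interval endpoints (assumes reward is
            # monotone on the interval, a simplification for this sketch).
            worst_return += min(reward_fn(lo, a), reward_fn(hi, a))
            s = lo if reward_fn(lo, a) < reward_fn(hi, a) else hi
        if worst_return > best_worst_case:
            best_seq, best_worst_case = seq, worst_return
    return best_seq[0]                    # execute first action, then replan

if __name__ == "__main__":
    # Toy demo on a 1-D system; model and reward are made up for illustration.
    rng = np.random.default_rng(0)
    toy_model = lambda s, a: (s + a, 0.1 + abs(a))  # drift + heteroscedastic noise
    reward = lambda s, a: -s**2 - 0.01 * a**2       # drive the state to zero
    candidates = [rng.uniform(-1, 1, size=5) for _ in range(64)]
    a0 = robust_mpc(toy_model, reward, state=1.0, action_seqs=candidates,
                    gamma=1.5, horizon=5)
    print("first action:", a0)
```

In this sketch, a calibrated gamma trades off between trusting the nominal model (gamma = 1) and increasingly conservative plans as gamma grows.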
Supplementary Material: zip
Latex Source Code: zip
Signed PMLR Licence Agreement: pdf
Readers: auai.org/UAI/2025/Conference, auai.org/UAI/2025/Conference/Area_Chairs, auai.org/UAI/2025/Conference/Reviewers, auai.org/UAI/2025/Conference/Submission552/Authors, auai.org/UAI/2025/Conference/Submission552/Reproducibility_Reviewers
Submission Number: 552