Addressing Misspecification in Simulation-based Inference through Data-driven Calibration

Published: 01 May 2025, Last Modified: 18 Jun 2025, ICML 2025 oral, CC BY 4.0
Abstract: Driven by steady progress in deep generative modeling, simulation-based inference (SBI) has emerged as the workhorse for inferring the parameters of stochastic simulators. However, recent work has demonstrated that model misspecification can harm SBI's reliability, preventing its adoption in important applications where only misspecified simulators are available. This work introduces robust posterior estimation (RoPE), a framework that overcomes model misspecification with a small real-world calibration set of ground-truth parameter measurements. We formalize the misspecification gap as the solution of an optimal transport (OT) problem between learned representations of real-world and simulated observations, allowing RoPE to learn a model of the misspecification without placing additional assumptions on its nature. RoPE shows how the calibration set and OT together offer a controllable balance between calibrated uncertainty and informative inference even under severely misspecified simulators. Results on four synthetic tasks and two real-world problems with ground-truth labels demonstrate that RoPE outperforms baselines and consistently returns informative and calibrated credible intervals.
Lay Summary: When scientists use computer simulations to understand real-world systems, the simulations are often imperfect and can lead to incorrect conclusions. This paper introduces a method called RoPE that helps fix these problems by using a small amount of real-world data to adjust the simulation results. This way, the method gives more accurate and trustworthy answers, even when the original simulation isn't perfect.
Primary Area: Probabilistic Methods->Bayesian Models and Methods
Keywords: Simulation-based Inference, Misspecification, Optimal Transport, Bayesian Inference, Neural Posterior Estimation, Robust Inference
Flagged For Ethics Review: true
Submission Number: 11889
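
The abstract describes the misspecification gap as the solution of an OT problem between learned representations of real-world and simulated observations. As a rough illustration of what such a coupling looks like, the sketch below computes an entropic OT plan with Sinkhorn iterations in NumPy. It is not the authors' implementation: the random embeddings stand in for learned representations, and the function name `sinkhorn_coupling`, the squared Euclidean cost, the uniform marginals, and the values of `epsilon` and `n_iters` are all assumptions made for this example.

```python
import numpy as np


def sinkhorn_coupling(z_real, z_sim, epsilon=0.1, n_iters=500):
    """Entropic OT plan between two embedding sets with uniform marginals.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    n, m = len(z_real), len(z_sim)
    # Pairwise squared Euclidean cost (an assumed choice of cost function).
    cost = ((z_real[:, None, :] - z_sim[None, :, :]) ** 2).sum(axis=-1)
    # Rescale the cost to [0, 1] so that exp(-cost / epsilon) stays well-conditioned.
    cost = cost / max(cost.max(), 1e-12)
    kernel = np.exp(-cost / epsilon)
    a = np.full(n, 1.0 / n)  # uniform mass on real-world observations
    b = np.full(m, 1.0 / m)  # uniform mass on simulated observations
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (kernel @ v)    # rescale rows toward marginal a
        v = b / (kernel.T @ u)  # rescale columns to match marginal b
    return u[:, None] * kernel * v[None, :]


rng = np.random.default_rng(0)
z_sim = rng.normal(size=(50, 2))         # stand-in for simulated embeddings
z_real = rng.normal(size=(10, 2)) + 0.5  # stand-in for shifted real-world embeddings

plan = sinkhorn_coupling(z_real, z_sim)
# Row-normalizing the plan gives, for each real observation, a set of
# weights over simulated observations that sum to one.
weights = plan / plan.sum(axis=1, keepdims=True)
```

Each row of `weights` can be read as a soft assignment of one real-world observation to the simulated observations. How such weights are combined with a posterior estimator, and how the calibration set shapes the representations, is specified in the paper itself and is not reproduced here.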