Abstract: Identifying the parameters of a non-linear model that best explain observed data is a core task across scientific fields. When such models rely on complex simulators, evaluating the likelihood is typically intractable, making traditional inference methods such as MCMC inapplicable. Simulation-based inference (SBI) addresses this by training deep generative models to approximate the posterior distribution over parameters using simulated data. In this work, we consider the tall data setting, where multiple independent observations provide additional information, allowing sharper posteriors and improved parameter identifiability.
Building on the flourishing score-based diffusion literature, F-NPSE (Geffner et al., 2023) estimates the tall data posterior by composing individual posterior scores produced by a network trained on single context observations. This enables more flexible and simulation-efficient inference than alternative approaches for tall datasets in SBI.
However, it relies on costly Langevin dynamics during sampling. We propose a new algorithm that eliminates the need for Langevin steps by explicitly approximating the diffusion process of the tall data posterior. Our method retains the advantages of compositional score-based inference while being significantly faster and more stable than F-NPSE. We demonstrate its improved performance on toy problems and standard SBI benchmarks, and showcase its scalability by applying it to a complex real-world model from computational neuroscience.
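For intuition, the score composition described above can be sketched in a few lines. This is a minimal illustration, not the released implementation: `score_net` (single-observation posterior score) and `prior_score` are hypothetical placeholders, and the composition is exact only at noise level zero.

```python
def tall_posterior_score(theta, xs, t, score_net, prior_score):
    """Compose per-observation scores into a tall-data posterior score.

    Uses p(theta | x_1:n) ∝ p(theta)^(1-n) * prod_i p(theta | x_i), whose
    log-gradient is (1 - n) times the prior score plus the sum of the
    single-observation posterior scores (exact at t = 0, approximate for t > 0).
    """
    n = len(xs)
    # Sum of single-observation posterior scores grad_theta log p_t(theta | x_i)
    individual = sum(score_net(theta, x, t) for x in xs)
    # Correct for the prior being counted once per observation
    return (1 - n) * prior_score(theta, t) + individual
```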
Certifications: J2C Certification
Submission Type: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: **Minor changes:**
- Following the action editor's recommendation, we homogenized the score function notation throughout the paper, always specifying the variable for the gradient operator.
- As requested by Reviewer So1v, we increased the size of Figures 5 and 10 and corrected all typos. We also added a reminder of the parameter dimension m and data dimension d in Section 4.2, and removed the term "gain" for parameter g in Section 4.3.
- Following Reviewer Tubi's comment, we clarified the clipping procedure in Section 4.2, specifying that samples are truncated to [-3, 3] at every step.
**Comparison with the deterministic sampler from Geffner et al. (2023):**
As promised to Reviewer p3EZ, we added both a theoretical and empirical comparison with the deterministic sampler proposed in Geffner et al. (2023, Appendix D):
- A Remark in Section 3.2 highlights the key theoretical distinction: their method composes Gaussian reverse transitions with a shared isotropic variance, while ours preserves the backward covariance structure via the Tweedie framework (see the illustrative identity after this list). Our approach is exact in the Gaussian case, while theirs is not unless all covariances are isotropic.
- A Remark in Section 4 introduces the deterministic sampler (referred to as DET_GEF) as an additional empirical baseline, pointing to the appendix for full results.
- A new appendix section provides the detailed theoretical comparison, empirical results on the toy models and SBI benchmarks (with figures), and a clarification of the Langevin behavior across settings — addressing questions raised by Reviewers p3EZ and Tubi regarding the counter-intuitive performance trends of the Langevin sampler.
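For readers who want the Gaussian intuition behind the covariance-preserving composition mentioned in the first bullet, the identity being leveraged is the standard product-of-Gaussians rule (a sketch with illustrative symbols, not a verbatim statement of the sampler): if each per-observation backward factor is $\mathcal{N}(\theta; \mu_i, \Sigma_i)$ and the prior factor is $\mathcal{N}(\theta; \mu_0, \Sigma_0)$, their composition $\prod_i \mathcal{N}(\theta; \mu_i, \Sigma_i)\,/\,\mathcal{N}(\theta; \mu_0, \Sigma_0)^{n-1}$ is proportional to $\mathcal{N}(\theta; \mu, \Sigma)$ with

$$\Sigma^{-1} = \sum_{i=1}^{n} \Sigma_i^{-1} - (n-1)\,\Sigma_0^{-1}, \qquad \mu = \Sigma\Big(\sum_{i=1}^{n} \Sigma_i^{-1}\mu_i - (n-1)\,\Sigma_0^{-1}\mu_0\Big),$$

provided the combined precision is positive definite. Assuming a shared isotropic variance instead collapses every $\Sigma_i$ to a multiple of the identity, consistent with the remark above that the deterministic sampler matches the Gaussian case only when all covariances are isotropic.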
Code: https://github.com/JuliaLinhart/diffusions-for-sbi
Assigned Action Editor: ~Andriy_Mnih1
Submission Number: 6284