Abstract: Simulation-based inference (SBI) aims to find the probabilistic inverse of a non-linear function
by fitting the posterior with a generative model on samples. Applications demand accurate
uncertainty quantification, which can be difficult to achieve and verify. Since the ground
truth model is implicitly defined in SBI, we cannot compute likelihood values nor draw
samples from the posterior. This renders two-sample testing against the posterior impossible
for any practical use and calls for proxy verification methods such as expected coverage
testing. We introduce a differentiable objective that encourages coverage in the generative
model by parameterizing the dual form of the total variation norm with neural networks.
However, we find that coverage tests can easily report a good fit even when the approximant
deviates significantly from the target distribution, and we give strong empirical evidence and
theoretical arguments why the expected coverage plot is, in general, not a reliable indicator
of posterior fit. To address this matter, we introduce a new ratio coverage plot as a better
alternative to coverage, which is not susceptible to the same blind spots. It comes at the
price of estimating a ratio between our model and the ground truth posterior, which can be
done using standard algorithms. We provide experimental results that back up this claim
and present multiple algorithms for estimating ratio coverage.
Submission Type: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Changes were made based on second-round reviewer feedback:
- replaced and expanded Figure 1
- renamed section "Adversarial Total Variation Norm Regularization" to "Classical Coverage Regularization Can Be Deceptive"
- changed order of sections: swapped section 4 and section 5
- moved discussion of adversary ratio regularizer into section 5 (previously section 4)
- moved sub-section from appendix to main text (now section 6.3), discussing blind spot problem for literature regularizers (with Figure 6)
- added more detail to section A.2 "Setup"
- made python source code available in section A.2 "Setup"
- rephrased "Related Work" paragraph for more precise wording
- added references to the definition of the KS divergence (Definition A.1) and the push-forward measure (Tao (2021))
- rephrased footnote on page 5
Changes were made based on reviewer feedback:
- Added explanation after Definition 2.1 and Theorem 2.3
- Changed beginning of Section 3
- Capitalized all references to Propositions, Algorithms, Theorems, etc.
- Removed sentence fragment
- Added footnote to Definition 3.4
- Added definition of $g_\#$ to Remark 3.6
- Rephrased beginning of section 4
- Expanded motivation for _ratio coverage_ at the beginning of section 5
- Added reference Corollary A.10 and Figure 2
- Reformatted to single-line formulas where possible
- Moved citations into parentheses where possible
- Improved clarity above Equation (29) why the unregularized objective is preferable for the ratio coverage
- Rephrased hypothesis testing paragraph
- Changed section 6 name to Experiments
- Added remark about diagonal plots in the discussion of Figure 4 (page 8)
- Added reference to section A.4 at the end of subsection 6.3
- Moved Definition of KS to Appendix (Definition A.1)
- Added training data generation information to section A.2
- Added text to Section A.4 and A.5
- Added new section A.6 with Figure 12
Assigned Action Editor: ~Michael_U._Gutmann1
Submission Number: 6257