Evaluating Sparse Galaxy Simulations via Out-of-Distribution Detection and Amortized Bayesian Model Comparison
Abstract: Cosmological simulations are a powerful tool to advance our understanding of
galaxy formation, and many simulations model key properties of real galaxies. A
question that naturally arises for such simulations in light of high-quality observational
data is: How close are the models to reality? Due to the high dimensionality
of the problem, many previous studies evaluate galaxy simulations using simplified
summary statistics of physical properties. In this work, we combine simulation-based
Bayesian model comparison with a novel misspecification detection technique
to compare simulated galaxy images of 6 hydrodynamical models against real
Sloan Digital Sky Survey (SDSS) observations. Since cosmological simulations
are computationally costly, we address the problem of low simulation budgets by
first training a k-sparse variational autoencoder (VAE) on the abundant dataset
of SDSS images. The VAE learns to extract informative latent embeddings and
delineates the typical set of real images. To reveal simulation gaps, we then perform
out-of-distribution (OOD) detection based on the logits of classifiers trained on
the embeddings of simulated images. Finally, we perform amortized Bayesian
model comparison using probabilistic classification, identifying the relatively
best-performing model along with partial explanations through SHAP values.
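
The three computational steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and every function name in it is hypothetical. It rests on three assumptions that the abstract does not confirm: that the k-sparse constraint keeps the k largest-magnitude latent entries per image, that the OOD score is the negative maximum logit (one common logit-based choice), and that the classifier's softmax output is read as posterior model probabilities under the prior implied by the class proportions in the training set.

```python
import numpy as np


def k_sparse(z, k):
    """Keep the k largest-magnitude entries in each row of z and zero the rest.

    Assumption: the abstract does not state the exact sparsification rule.
    """
    z = np.asarray(z, dtype=float)
    top_k = np.argsort(np.abs(z), axis=1)[:, -k:]  # indices of the top-k entries per row
    mask = np.zeros(z.shape, dtype=bool)
    np.put_along_axis(mask, top_k, True, axis=1)
    return np.where(mask, z, 0.0)


def posterior_model_probs(logits):
    """Numerically stable softmax over classifier logits.

    For a classifier trained with cross-entropy on embeddings labeled by
    simulation model, the softmax output approximates p(model | image) under
    the prior implied by the training-set class proportions.
    """
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=1, keepdims=True)
    unnormalized = np.exp(shifted)
    return unnormalized / unnormalized.sum(axis=1, keepdims=True)


def max_logit_ood_score(logits):
    """Negative maximum logit per row: higher values suggest a more atypical input.

    Assumption: this is an illustrative score, not necessarily the paper's.
    """
    return -np.asarray(logits, dtype=float).max(axis=1)


# Toy usage: two embeddings with 4 latent dimensions, three candidate models.
embeddings = k_sparse(np.array([[0.1, -3.0, 2.0, 0.5],
                                [1.0, 0.2, -0.3, 0.0]]), k=2)
toy_logits = np.array([[2.0, 0.0, -1.0],
                       [0.0, 0.0, 0.0]])
probs = posterior_model_probs(toy_logits)
scores = max_logit_ood_score(toy_logits)
```

In the pipeline described above, the logits would come from classifiers trained on embeddings of simulated images and then evaluated on embeddings of SDSS images. Real images whose scores fall outside the range seen for simulations would point to a simulation gap.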