posteriordb: Testing, Benchmarking and Developing Bayesian Inference Algorithms

Måns Magnusson; Jakob Torgander; Paul-Christian Bürkner; Lu Zhang; Bob Carpenter; Aki Vehtari

Published: 22 Jan 2025, Last Modified: 06 Mar 2025, AISTATS 2025 Oral, CC BY 4.0

TL;DR: We present posteriordb, a database of posteriors for evaluating and comparing the accuracy and efficiency of general-purpose inference algorithms in probabilistic programming languages.

Abstract: The general applicability and robustness of posterior inference algorithms are critical to widely used probabilistic programming languages such as Stan, PyMC, Pyro, and Turing.jl. When designing a new inference algorithm, whether it involves Monte Carlo sampling or variational approximation, the fundamental problem is evaluating its accuracy and efficiency across a range of representative target posteriors. To solve this problem, we propose posteriordb, a database of models and data sets defining target densities along with reference Monte Carlo draws. We further provide a guide to best practices in using posteriordb for algorithm evaluation and comparison. To provide a wide range of realistic posteriors, posteriordb currently comprises 120 representative models with data, and has been instrumental in developing several inference algorithms.
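As a rough illustration of how reference draws can be used to check accuracy, the sketch below compares the posterior mean estimated by a candidate algorithm against the mean of a set of reference draws, scaled by the reference standard deviation. This is only a minimal sketch under stated assumptions: it does not use the posteriordb API, the function name `standardized_mean_error` is hypothetical, the draws are simulated stand-ins rather than draws from the database, and the metric is one simple choice rather than the evaluation procedure the paper recommends.

```python
import random
import statistics


def standardized_mean_error(candidate_draws, reference_draws):
    """Absolute difference in means, in units of the reference standard deviation.

    Hypothetical helper for illustration; not part of posteriordb.
    """
    ref_mean = statistics.fmean(reference_draws)
    ref_sd = statistics.stdev(reference_draws)
    return abs(statistics.fmean(candidate_draws) - ref_mean) / ref_sd


rng = random.Random(0)

# Simulated stand-in for reference draws of a single parameter
# (assumption: a real study would load these from the database instead).
reference = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

# Simulated stand-ins for two algorithms under test: one accurate, one biased.
accurate = [rng.gauss(0.0, 1.0) for _ in range(2_000)]
biased = [rng.gauss(0.5, 1.0) for _ in range(2_000)]

error_accurate = standardized_mean_error(accurate, reference)
error_biased = standardized_mean_error(biased, reference)

print(f"accurate algorithm: {error_accurate:.3f}")
print(f"biased algorithm:   {error_biased:.3f}")
```

In practice the same comparison would be repeated per parameter and per posterior, since an algorithm that does well on one target may fail on another.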

Submission Number: 423
