Squared families are useful conjugate priors

Published: 18 Sept 2025, Last Modified: 29 Oct 2025 · NeurIPS 2025 poster · CC BY 4.0
Keywords: Bayesian inference, conjugate families, Gaussian processes, squared families
TL;DR: Squared families of probability distributions are conjugate prior families for many likelihoods with applications including regression.
Abstract: Squared families of probability distributions have been studied and applied in numerous machine learning contexts. Typically, they appear as likelihoods, where their advantageous computational, geometric and statistical properties are exploited for fast estimation algorithms, representational power and statistical guarantees. Here, we investigate the use of squared families as prior beliefs in Bayesian inference. We find that they can form helpful conjugate families, often allowing for closed-form, tractable Bayesian inference and marginal likelihoods. We apply such conjugate families to Bayesian regression in feature space using end-to-end learnable neural network features. This setting allows for a rich multi-modal alternative to Gaussian processes with neural network features, often called deep kernel learning. We demonstrate our method on few-shot learning, outperforming existing neural methods based on Gaussian processes and normalising flows.
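To make the abstract's central object concrete, the following is a minimal sketch of a one-dimensional squared family: a density proportional to the square of a linear-in-features function times a base measure, normalised in closed form by a quadratic form in the parameters. The choice of features φ(x) = [1, x] and a standard-normal base measure is an assumption for illustration only, not the construction used in the paper.

```python
import numpy as np

def phi(x):
    """Hypothetical feature map φ(x) = [1, x] (illustration only)."""
    return np.stack([np.ones_like(x), x], axis=-1)

def squared_family_pdf(x, v):
    """Density p(x) = (v·φ(x))² μ(x) / (vᵀ M v), with μ = N(0, 1).

    The normaliser is the quadratic form vᵀ M v, where
    M_ij = ∫ φ_i(x) φ_j(x) dμ(x); for φ = [1, x] under N(0, 1), M = I.
    """
    mu = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)  # base measure N(0, 1)
    M = np.eye(2)                                # closed-form moment matrix
    Z = v @ M @ v                                # normalising constant
    return (phi(x) @ v) ** 2 * mu / Z

# Sanity check: the density integrates to one (Riemann sum on a fine grid).
xs = np.linspace(-10.0, 10.0, 20001)
v = np.array([1.0, 0.5])
mass = np.sum(squared_family_pdf(xs, v)) * (xs[1] - xs[0])
```

The closed-form quadratic normaliser vᵀMv is what makes these families computationally convenient; the paper exploits analogous tractability when the squared family plays the role of a conjugate prior rather than a likelihood.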
Primary Area: Probabilistic methods (e.g., variational inference, causal inference, Gaussian processes)
Submission Number: 9703