Keywords: online learning, Bayesian neural networks, variational inference, natural gradient descent
TL;DR: We improve on online variational Bayes using natural gradient descent on expected log-likelihood.
Abstract: We propose a novel approach to sequential Bayesian inference based on variational Bayes (VB). The key insight is that, in the online setting, we do not need to add the KL term that regularizes towards the prior (which, in this setting, is the posterior from the previous timestep); instead we can optimize just the expected log-likelihood, performing a single step of natural gradient descent starting at the prior predictive. We prove that this method recovers exact Bayesian inference when the model is conjugate. We also show how to compute an efficient deterministic approximation to the VB objective, as well as to our simplified objective, when the variational distribution is Gaussian or a sub-family thereof, including the case of a diagonal plus low-rank precision matrix. We show empirically that our method outperforms other online VB methods in the non-conjugate setting, such as online learning for neural networks, especially when controlling for computational cost.
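To make the key insight concrete, here is a minimal, hedged sketch (not the authors' code) of the conjugate case the abstract alludes to: for a linear-Gaussian model, a single natural-gradient step of unit size on the expected log-likelihood, started at the prior, reproduces the exact Bayes posterior. It uses the standard fact that natural-gradient ascent in the Gaussian's natural parameters equals ordinary gradient ascent in its mean parameters. All variable names and the specific numbers are illustrative assumptions.

```python
# Illustrative sketch (assumed setup, not the paper's implementation):
# one natural-gradient step on E_q[log p(y|x)] from the prior recovers
# the exact posterior in a conjugate linear-Gaussian model.
import jax
import jax.numpy as jnp

def expected_loglik(mu1, mu2, y, R_inv):
    """E_q[log N(y | x, R)] up to additive constants, written as a function
    of the Gaussian mean parameters mu1 = E[x], mu2 = E[x x^T]."""
    return y @ R_inv @ mu1 - 0.5 * jnp.trace(R_inv @ mu2)

# Prior q_0 = N(m0, P0^{-1}) and a single observation (hypothetical numbers).
m0 = jnp.array([0.0, 0.0])
P0 = 2.0 * jnp.eye(2)               # prior precision
y = jnp.array([1.0, -0.5])          # observation
R_inv = 4.0 * jnp.eye(2)            # observation precision

# Natural parameters of the prior: lam1 = P m, lam2 = -P / 2.
lam1, lam2 = P0 @ m0, -0.5 * P0
# Mean parameters of the prior: mu1 = m, mu2 = m m^T + P^{-1}.
mu1, mu2 = m0, jnp.outer(m0, m0) + jnp.linalg.inv(P0)

# Natural-gradient step (unit step size): add the mean-parameter gradient
# of the expected log-likelihood to the natural parameters.
g1, g2 = jax.grad(expected_loglik, argnums=(0, 1))(mu1, mu2, y, R_inv)
lam1_new, lam2_new = lam1 + g1, lam2 + g2

# Convert back to (mean, precision) and compare with the exact posterior.
P_new = -2.0 * lam2_new
m_new = jnp.linalg.solve(P_new, lam1_new)
P_exact = P0 + R_inv
m_exact = jnp.linalg.solve(P_exact, P0 @ m0 + R_inv @ y)
print(jnp.allclose(P_new, P_exact), jnp.allclose(m_new, m_exact))  # True True
```

In this conjugate case no KL regularizer is needed because the step starts at the prior, where the KL term's natural gradient vanishes; the paper's contribution is to exploit this in the non-conjugate, online setting.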
Primary Area: Probabilistic methods (for example: variational inference, Gaussian processes)
Submission Number: 14664