Keywords: online learning, neural bandits, sequential decision making, Kalman filtering, frequentist, Bayes, martingale posterior
TL;DR: Scalable online training of neural networks via frequentist filtering, with martingale-posterior-inspired predictives for sequential decision making.
Abstract: We introduce scalable algorithms for online learning of neural network parameters and Bayesian sequential decision making.
Unlike classical Bayesian neural networks,
which induce predictive uncertainty through a posterior over model parameters,
our methods adopt a predictive-first perspective based on martingale posteriors.
In particular, we work directly with the one-step-ahead posterior predictive, which we
parameterize with a neural network and update sequentially with incoming observations.
This decouples Bayesian decision-making from parameter-space inference:
we sample from the posterior predictive for decision making,
and update the parameters of the posterior predictive via fast, frequentist Kalman-filter-like
recursions.
Our algorithms operate in a fully online, replay-free setting, providing principled uncertainty quantification without costly posterior sampling.
Empirically, they achieve competitive performance–speed trade-offs in non-stationary contextual bandits and Bayesian optimization,
offering 10–100 times faster inference than classical Thompson sampling while maintaining comparable or superior decision performance.
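The abstract's core recipe — maintain a Gaussian belief over (last-layer) predictive parameters, update it with a fast Kalman-filter-like recursion, and sample from the posterior predictive for decisions — can be sketched as below. This is an illustrative sketch only, not the paper's implementation: the class name `KalmanPredictive`, the linear-Gaussian observation model `y ≈ phi^T w + noise`, and the process-noise term `Q` (which models non-stationary drift) are all assumptions for the example.

```python
import numpy as np

class KalmanPredictive:
    """Hypothetical sketch: Gaussian belief N(mu, Sigma) over weights w,
    updated one observation at a time, replay-free, in O(d^2) per step."""

    def __init__(self, dim, obs_var=1.0, process_var=0.0):
        self.mu = np.zeros(dim)             # mean of belief over weights
        self.Sigma = np.eye(dim)            # covariance of belief
        self.R = obs_var                    # observation-noise variance
        self.Q = process_var * np.eye(dim)  # drift term for non-stationarity

    def update(self, phi, y):
        """Recursive Kalman-style update from a single pair (phi, y)."""
        self.Sigma = self.Sigma + self.Q        # predict step (random-walk drift)
        s = phi @ self.Sigma @ phi + self.R     # predictive variance of y
        k = self.Sigma @ phi / s                # Kalman gain
        self.mu = self.mu + k * (y - phi @ self.mu)
        self.Sigma = self.Sigma - np.outer(k, phi @ self.Sigma)

    def sample_prediction(self, phi, rng):
        """Thompson-style draw from the posterior predictive at features phi."""
        w = rng.multivariate_normal(self.mu, self.Sigma)
        return phi @ w
```

In a contextual-bandit loop, one would call `sample_prediction` once per arm and play the argmax; because the update consumes each observation exactly once, no replay buffer or costly posterior sampling over network parameters is needed.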
Primary Area: Probabilistic methods (e.g., variational inference, causal inference, Gaussian processes)
Submission Number: 7334