Tighter sparse variational Gaussian processes

TMLR Paper4157 Authors

07 Feb 2025 (modified: 11 Mar 2025) · Under review for TMLR · CC BY 4.0
Abstract: Sparse variational Gaussian process (GP) approximations based on inducing points have become the de facto standard for scaling GPs to large datasets, owing to their theoretical elegance, computational efficiency, and ease of implementation. This paper introduces a provably tighter variational approximation by relaxing the standard assumption that the conditional approximate posterior given the inducing points must match that in the prior. The key innovation is to modify the conditional posterior to have smaller variances than those of the prior at the training points. We derive the collapsed bound for the regression case, describe how to use the proposed approximation in large data settings, and discuss its application to handle orthogonally structured inducing points and GP latent variable models. Extensive experiments on regression benchmarks, classification, and latent variable models demonstrate that the proposed approximation consistently matches or outperforms standard sparse variational GPs while maintaining the same computational cost. An implementation will be made available in all popular GP packages.
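For context, a brief sketch of the assumption being relaxed may help. In standard sparse variational GPs (Titsias, 2009), the approximate posterior factorises with the prior conditional; the abstract describes replacing that conditional with one whose variances are smaller at the training inputs. The second factorisation below is therefore schematic: the notation $\tilde{q}(f \mid u)$ and its exact parameterisation are assumptions for illustration, not the paper's stated form.

$$q_{\text{standard}}(f, u) = p(f \mid u)\, q(u), \qquad q_{\text{proposed}}(f, u) = \tilde{q}(f \mid u)\, q(u),$$

where $\tilde{q}(f \mid u)$ is chosen so that its variances at the training points are smaller than those of the prior conditional $p(f \mid u)$.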
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Update 1: Corrected parameterisation of the conditional posterior. Update 2 (final update for review): Updated the appendix. Update 3: Added experiments with fixed q and hyperparameters, moved the GPLVM results to the appendix, included hyperparameter plots, and fixed typos. Update 4: Compared $m_n$ parameterisations on a small dataset.
Assigned Action Editor: ~Geoff_Pleiss1
Submission Number: 4157