Tighter sparse variational Gaussian processes

TMLR Paper4157 Authors

07 Feb 2025 (modified: 11 Mar 2025) · Under review for TMLR · CC BY 4.0
Abstract: Sparse variational Gaussian process (GP) approximations based on inducing points have become the de facto standard for scaling GPs to large datasets, owing to their theoretical elegance, computational efficiency, and ease of implementation. This paper introduces a provably tighter variational approximation by relaxing the standard assumption that the conditional approximate posterior given the inducing points must match that in the prior. The key innovation is to modify the conditional posterior to have smaller variances than those of the prior at the training points. We derive the collapsed bound for the regression case, describe how to use the proposed approximation in large data settings, and discuss its application to handle orthogonally structured inducing points and GP latent variable models. Extensive experiments on regression benchmarks, classification, and latent variable models demonstrate that the proposed approximation consistently matches or outperforms standard sparse variational GPs while maintaining the same computational cost. An implementation will be made available in all popular GP packages.
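For context, a brief sketch of the assumption being relaxed may help. In standard sparse variational GPs (Titsias, 2009), the approximate posterior factorises with the prior conditional; the abstract describes replacing that conditional with one whose variances are smaller at the training inputs. The second factorisation below is therefore schematic: the notation $\tilde{q}(f \mid u)$ and its exact parameterisation are assumptions for illustration, not the paper's stated form.

$$q_{\text{standard}}(f, u) = p(f \mid u)\, q(u), \qquad q_{\text{proposed}}(f, u) = \tilde{q}(f \mid u)\, q(u),$$

where $\tilde{q}(f \mid u)$ is chosen so that its variances at the training points are smaller than those of the prior conditional $p(f \mid u)$.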
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Update 1: Corrected parameterisation of the conditional posterior. Update 2 (final update for review): Updated the appendix. Update 3: Added experiments with fixed q and hyperparameters, moved the GPLVM results to the appendix, included hyperparameter plots, and fixed typos. Update 4: Compared $m_n$ parameterisations on a small dataset.
Assigned Action Editor: ~Geoff_Pleiss1
Submission Number: 4157