Adjusting Model Size in Continual Gaussian Processes: How Big is Big Enough?

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 Spotlight Poster · License: CC BY-NC-ND 4.0
TL;DR: The paper addresses the challenge of determining model size in machine learning, focusing on online Gaussian processes. It presents a method for automatically adjusting model capacity during training while ensuring high performance.
Abstract: Many machine learning models require setting a parameter that controls their size before training, e.g., the number of neurons in a DNN or the number of inducing points in a GP. Increasing capacity typically improves performance until all the information in the dataset is captured; beyond this point, computational cost keeps growing without any gain in performance. This leads to the question "How big is big enough?" We investigate this problem for Gaussian processes (single-layer neural networks) in continual learning, where data becomes available incrementally and the final dataset size is therefore not known before training, preventing the use of heuristics that set a fixed model size in advance. We develop a method that automatically adjusts model size while maintaining near-optimal performance. Our experimental procedure follows the constraint that any hyperparameters must be set without seeing dataset properties, and we show that our method performs well across diverse datasets without adjusting its hyperparameter, demonstrating that it requires less tuning than other approaches.
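To make the idea of adjusting model size concrete, here is a minimal sketch of capacity growth in a streaming sparse GP: a Nyström-style residual measures how poorly the current inducing points cover each incoming batch, and new points are added greedily only while that residual exceeds a threshold. The squared-exponential kernel, the threshold `tau`, the cap `max_new`, and the greedy selection rule are illustrative assumptions, not the criterion proposed in the paper; see the linked repository for the actual method.

```python
import numpy as np

def rbf(X, Y, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix between row-wise inputs X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def nystrom_residual(Xnew, Z, jitter=1e-6):
    """Per-point residual k(x,x) - k(x,Z) Kzz^{-1} k(Z,x): how badly the
    current inducing set Z covers each new input."""
    if Z is None or len(Z) == 0:
        # No inducing points yet: residual equals the prior variance.
        return np.full(len(Xnew), rbf(Xnew[:1], Xnew[:1])[0, 0])
    Kzz = rbf(Z, Z) + jitter * np.eye(len(Z))
    Kxz = rbf(Xnew, Z)
    return rbf(Xnew, Xnew).diagonal() - np.einsum(
        "ij,ij->i", Kxz, np.linalg.solve(Kzz, Kxz.T).T)

def grow_inducing_set(Z, Xbatch, tau=0.1, max_new=10):
    """Greedily add batch points whose residual exceeds the threshold tau."""
    for _ in range(max_new):
        r = nystrom_residual(Xbatch, Z)
        i = int(np.argmax(r))
        if r[i] <= tau:
            break  # current capacity already explains this batch well enough
        Z = Xbatch[i:i + 1] if Z is None else np.vstack([Z, Xbatch[i:i + 1]])
    return Z

# Toy stream: the inducing set grows on early batches, then saturates.
rng = np.random.default_rng(0)
Z = None
for t in range(5):
    Xbatch = rng.uniform(-3, 3, size=(50, 1))
    Z = grow_inducing_set(Z, Xbatch, tau=0.05)
    print(f"batch {t}: {len(Z)} inducing points")
```

Running this on the toy stream shows the qualitative behaviour the paper targets: capacity grows quickly while new data keep revealing unexplored regions of input space, then stops growing once the inducing set is "big enough", so computation no longer increases even though data continue to arrive.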
Lay Summary: How can we build machine learning models that are just the right size (not too small, not too big), especially when we do not know how much data we will have? Many ML models require a size parameter to be fixed before training, e.g., the number of neurons in a DNN or the number of inducing points in a GP. Choosing this size is important: if the model is too small, it will not perform well; if it is too large, it wastes computational resources. This raises a key question: _how big is big enough?_ We focus on this problem for a specific type of model called a Gaussian process in a continual learning setting, where data arrive over time and cannot be stored. Since we do not know in advance how much data there will be, we cannot set the best model size at the start of training. We propose a new method that automatically adjusts the model's size as new data arrive, aiming to maintain near-optimal performance without unnecessary computation. Our results show that this method performs well across diverse datasets without the need to adjust its hyperparameter, demonstrating that it requires less tuning than other approaches.
Link To Code: https://github.com/guiomarpescador/vips
Primary Area: Probabilistic Methods
Keywords: continual learning, gaussian processes, adaptive model capacity, variational inference, inducing points, sparse approximations, bayesian nonparametrics
Submission Number: 10170