TL;DR: How to leverage efficient product kernels on gridded data with missing values for scalable GP regression
Abstract: Applying Gaussian processes (GPs) to very large datasets remains a challenge due to limited computational scalability. Matrix structures, such as the Kronecker product, can accelerate operations significantly, but their application commonly entails approximations or unrealistic assumptions. In particular, the most common path to creating a Kronecker-structured kernel matrix is by evaluating a product kernel on gridded inputs that can be expressed as a Cartesian product. However, this structure is lost if any observation is missing, breaking the Cartesian product structure, which frequently occurs in real-world data such as time series. To address this limitation, we propose leveraging latent Kronecker structure, by expressing the kernel matrix of observed values as the projection of a latent Kronecker product. In combination with iterative linear system solvers and pathwise conditioning, our method facilitates inference of exact GPs while requiring substantially fewer computational resources than standard iterative methods. We demonstrate that our method outperforms state-of-the-art sparse and variational GPs on real-world datasets with up to five million examples, including robotics, automated machine learning, and climate applications.
Lay Summary: Gaussian processes are a flexible machine learning model, but it can be difficult to apply them to large amounts of data because they require a lot of computation. For specific types of data, these computations can be done efficiently. For example, if the data consists of temperature measurements across different cities and days, we can simplify the computations by considering cities and days separately. However, this typically requires that a temperature measurement is available for each pair of city and day. We introduce a way to make computations efficient even if, for example, temperature measurements are not available for certain cities on some days. In experiments using real-world robotics, machine learning, and climate data, we demonstrate that our method is faster and performs better than other scalable alternatives.
Link To Code: https://github.com/jandylin/Latent-Kronecker-GPs
Primary Area: Probabilistic Methods->Gaussian Processes
Keywords: Gaussian process, Kronecker, Bayesian, scalable, product kernel, gridded data, missing values
Submission Number: 12640
Loading