Efficient multi-scale Gaussian process regression for massive remote sensing data with satGP v0.1.2

Published: 31 Jul 2023, Last Modified: 11 Oct 2024, OpenReview Archive Direct Upload, Readers: Everyone, License: CC BY 4.0
Abstract: Satellite remote sensing provides a global view of processes on Earth and has unique benefits over ground-based measurements, such as global coverage and enormous data volume. The typical downsides are spatial and temporal gaps and potentially low data quality. Meaningful statistical inference from such data requires overcoming these problems and developing efficient and robust computational tools. We design and implement a computationally efficient multi-scale Gaussian process (GP) software package, satGP, geared towards remote sensing applications. The software is able to handle problems of enormous size and to compute marginals and sample from the random field conditioned on at least hundreds of millions of observations. This is achieved by optimizing the computation through, e.g., randomization and splitting the problem into parallel local subproblems that aggressively discard uninformative data. We describe the mean function of the Gaussian process by approximating marginals of a Markov random field (MRF). Variability around the mean is modeled with a multi-scale covariance kernel, which consists of Matérn, exponential, and periodic components. We also demonstrate how winds can be used to inform covariances locally. The covariance kernel parameters are learned by calculating an approximate marginal maximum likelihood estimate, and the validity of both the multi-scale approach and the method used to learn the kernel parameters is verified in synthetic experiments. We apply these techniques to a moderate-size ozone data set produced by an atmospheric chemistry model and to the very large number of observations retrieved from the Orbiting Carbon Observatory 2 (OCO-2) satellite. The satGP software is released under an open-source license.
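To make the idea of a multi-scale covariance kernel concrete, the following is a minimal Python sketch that sums a Matérn, an exponential, and a periodic component. It is not satGP's implementation or API: the additive combination, the Matérn smoothness of 5/2, the Euclidean treatment of the coordinates, and all names and parameter values (`multiscale_kernel`, `PARAMS`, the length scales and the 365-day period) are assumptions chosen for illustration only.

```python
import numpy as np

# Hypothetical parameters (not taken from satGP): one standard deviation and
# length scale per component, plus a period for the periodic component.
PARAMS = {
    "matern": {"sigma": 1.0, "ell": 5.0},
    "exponential": {"sigma": 0.5, "ell": 0.5},
    "periodic": {"sigma": 0.8, "ell": 1.0, "period": 365.0},
}

def matern52(r, sigma, ell):
    """Matern kernel with smoothness 5/2 (closed form) at distance r."""
    s = np.sqrt(5.0) * r / ell
    return sigma**2 * (1.0 + s + s**2 / 3.0) * np.exp(-s)

def exponential(r, sigma, ell):
    """Exponential kernel (Matern with smoothness 1/2) at distance r."""
    return sigma**2 * np.exp(-r / ell)

def periodic(dt, sigma, ell, period):
    """Standard periodic kernel evaluated on a time difference dt."""
    return sigma**2 * np.exp(-2.0 * np.sin(np.pi * dt / period) ** 2 / ell**2)

def multiscale_kernel(x1, x2, params=PARAMS):
    """Covariance matrix between rows of x1 and x2.

    Each row is (coord_1, coord_2, time). Assumption: coordinates are treated
    as Euclidean, and the components are simply added together.
    """
    diff = x1[:, None, :] - x2[None, :, :]
    r = np.linalg.norm(diff, axis=-1)   # space-time distance
    dt = np.abs(diff[..., 2])           # time difference only
    return (
        matern52(r, **params["matern"])
        + exponential(r, **params["exponential"])
        + periodic(dt, **params["periodic"])
    )

def posterior_mean(x_obs, y_obs, x_new, noise_var=0.1, params=PARAMS):
    """Zero-mean GP posterior mean at x_new given noisy observations."""
    k_oo = multiscale_kernel(x_obs, x_obs, params) + noise_var * np.eye(len(x_obs))
    k_no = multiscale_kernel(x_new, x_obs, params)
    return k_no @ np.linalg.solve(k_oo, y_obs)
```

In this sketch each component captures variability at a different scale: the long-length-scale Matérn term models smooth large-scale structure, the short-length-scale exponential term models rough local variation, and the periodic term models a seasonal cycle. The dense solve in `posterior_mean` scales cubically with the number of observations, which is why the abstract's approach of splitting the problem into local subproblems matters at satellite data volumes.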