Keywords: time series, probabilistic forecasting, Gaussian processes, power value
TL;DR: A probabilistic sparse variational model based on Gaussian processes is developed for forecasting power values.
Abstract: Time series forecasting is a pressing task in engineering, climatology, economics, and sociology, since time series characterize many physical and economic processes. Forecasting methods fall into short-term, medium-term, and long-term. Statistical models usually predict values only 1-3 points ahead, which is insufficient for many tasks. To guarantee forecasting accuracy over a large number of points ahead, it is advisable to apply probabilistic forecasting models, as they provide a confidence interval. The uncertainty of the forecast values can be expressed through different probability measures (PDF, CDF, quantiles, intervals, variance, and others). To build a probabilistic prediction model, Gaussian process regression was used for the desired feature vector, conditioned on the observed data in the presence of noise. Bayes' theorem was then used to compute the posterior distribution, e.g., of the power plant output, i.e., the distribution obtained after observations, which is proportional to the product of the prior probability and the likelihood. For convenience, the power forecast probability distribution was normalized with a logit-normal transformation so that it varies over the range [0, 1].
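A minimal sketch of this Gaussian process regression step is shown below, assuming the GPflow library and a hypothetical toy power series (the data, kernel hyperparameters, and variable names are illustrative, not the authors' setup); the squared exponential kernel and the logit transform to [0, 1] follow the description above.

```python
# Minimal sketch (not the authors' code): exact GP regression on a
# logit-transformed power series, assuming GPflow and synthetic data.
import numpy as np
import gpflow
from scipy.special import logit, expit  # logit-normal transform and its inverse

# Hypothetical data: power values scaled into (0, 1) before the logit.
t = np.linspace(0, 10, 200).reshape(-1, 1)
p = 0.5 + 0.3 * np.sin(t) + 0.02 * np.random.randn(*t.shape)
y = logit(np.clip(p, 1e-6, 1 - 1e-6))  # regress in logit space

# Squared exponential (RBF) kernel, as in the covariance function above.
model = gpflow.models.GPR((t, y), kernel=gpflow.kernels.SquaredExponential())
gpflow.optimizers.Scipy().minimize(model.training_loss, model.trainable_variables)

# Posterior mean/variance at future times; expit maps back to [0, 1].
t_new = np.linspace(10, 12, 40).reshape(-1, 1)
mean, var = model.predict_f(t_new)
forecast = expit(mean.numpy())
```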
The probabilistic sparse variational Gaussian process (SVGP) model included a fully connected neural network to model the mean function, together with a covariance function using a squared exponential kernel. Both the variance of the time series and the covariance matrix are optimized during training; these and other parameters are updated by maximizing the marginal likelihood. Since the training time complexity of an exact GP is O(N^3), it is not suitable for applications with large training sets. In this context, the SVGP method was developed: by introducing M inducing points to approximate the original GP, the complexity is reduced to O(NM^2). Variational inference further reduces the computational burden: it seeks an approximate posterior q that minimizes the Kullback-Leibler divergence to the true posterior p, a minimization problem equivalent to maximizing the evidence lower bound (ELBO). This reduced the dimensionality of the problem and significantly saved server RAM. The prediction results included plots of the forecast energy parameter and the values of the MAE, RMSE, and CRPS metrics.
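The sparse variational construction can be sketched as follows, again assuming GPflow (whose SVGP class implements this inducing point approximation) and synthetic data; the paper's neural network mean function is omitted here in favor of the default zero mean, so this illustrates the inducing point/ELBO machinery under stated assumptions rather than the authors' exact model.

```python
# Minimal SVGP sketch (assumptions: GPflow, a toy series, M = 20 inducing
# points, zero mean function). Variational inference maximizes the ELBO,
# equivalent to minimizing KL(q || p) between the variational posterior q
# and the true posterior p, at O(N M^2) cost instead of O(N^3).
import numpy as np
import tensorflow as tf
import gpflow

N, M = 2000, 20
t = np.linspace(0, 10, N).reshape(-1, 1)
y = np.sin(t) + 0.1 * np.random.randn(N, 1)  # hypothetical series

Z = t[np.random.choice(N, M, replace=False)].copy()  # inducing inputs
svgp = gpflow.models.SVGP(
    kernel=gpflow.kernels.SquaredExponential(),  # squared exponential kernel
    likelihood=gpflow.likelihoods.Gaussian(),    # observation noise variance
    inducing_variable=Z,
    num_data=N,
)

# Stochastic maximization of the ELBO on minibatches: only the M x M
# covariance of the inducing points is inverted, never the full N x N one.
batches = iter(tf.data.Dataset.from_tensor_slices((t, y)).repeat().shuffle(N).batch(64))
opt = tf.keras.optimizers.Adam(0.01)
for _ in range(2000):
    Xb, Yb = next(batches)
    with tf.GradientTape() as tape:
        loss = svgp.training_loss((Xb, Yb))  # negative ELBO estimate
    opt.apply_gradients(zip(tape.gradient(loss, svgp.trainable_variables),
                            svgp.trainable_variables))
```

After training, svgp.predict_f returns the posterior mean and variance at new time points, from which point forecasts and the MAE, RMSE, and CRPS metrics mentioned above can be computed.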
Submission Number: 21