Abstract: Variational data assimilation estimates the states of a dynamical system by minimizing a cost function that fits a numerical model to observational data. Although four-dimensional variational assimilation (4D-Var) is widely used, it incurs high computational cost in complex nonlinear systems and depends on imperfect state-observation mappings. Deep learning (DL) offers more expressive approximators, but integrating DL models into 4D-Var is challenging because of their nonlinearity and the lack of theoretical guarantees on the assimilation results. In this paper, we propose \textit{Tensor-Var}, a novel framework that integrates kernel conditional mean embedding (CME) with 4D-Var to linearize nonlinear dynamics, yielding a convex optimization problem in a learned feature space. Moreover, our method provides a new perspective on solving 4D-Var linearly, with theoretical guarantees that assimilation results are consistent between the original and feature spaces. To handle large-scale problems, we propose a method to learn deep features (DFs) using neural networks within the Tensor-Var framework. Experiments on chaotic systems and global weather prediction with real-time observations show that Tensor-Var outperforms conventional and DL hybrid 4D-Var baselines in accuracy while achieving a 10- to 20-fold speedup.
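To illustrate why linear dynamics make the assimilation problem convex, here is a minimal sketch, not the Tensor-Var implementation. It assumes a hypothetical feature-space state `z` with linear dynamics `A` and a linear observation operator `H`, both hand-picked toy matrices rather than learned CME operators. The names `solve_linear_4dvar`, `B_inv`, and `R_inv` are illustrative assumptions. Under these assumptions the 4D-Var cost is quadratic in the initial state, so its minimizer follows from one linear solve.

```python
import numpy as np

def solve_linear_4dvar(A, H, z_b, ys, B_inv, R_inv):
    """Return the z0 minimizing the quadratic cost
    J(z0) = (z0 - z_b)^T B_inv (z0 - z_b)
            + sum_t (y_t - H A^t z0)^T R_inv (y_t - H A^t z0).
    """
    lhs = B_inv.copy()      # accumulates B_inv + sum_t G_t^T R_inv G_t
    rhs = B_inv @ z_b       # accumulates B_inv z_b + sum_t G_t^T R_inv y_t
    G = H.copy()            # G_t = H A^t, starting at t = 0
    for y in ys:
        lhs += G.T @ R_inv @ G
        rhs += G.T @ R_inv @ y
        G = G @ A           # advance to H A^(t+1)
    # Setting the gradient of J to zero gives the normal equations lhs z0 = rhs.
    return np.linalg.solve(lhs, rhs)

# Toy setup (all values are assumptions chosen for illustration).
A = np.array([[0.9, 0.2, 0.0],
              [-0.2, 0.9, 0.1],
              [0.0, -0.1, 0.95]])
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])   # observe the first and third components
z_true = np.array([1.0, -0.5, 2.0])

# Noise-free observations y_t = H A^t z_true over a 10-step window.
ys = []
z = z_true.copy()
for _ in range(10):
    ys.append(H @ z)
    z = A @ z

z_b = np.zeros(3)            # uninformative background guess
B_inv = 1e-8 * np.eye(3)     # very weak background term
R_inv = np.eye(2)            # unit observation precision
z_hat = solve_linear_4dvar(A, H, z_b, ys, B_inv, R_inv)
```

Because the cost is quadratic, a single linear solve replaces the iterative minimization needed for nonlinear dynamics. With noise-free observations and a weak background term, `z_hat` should closely match `z_true`.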
Lay Summary: Data assimilation plays a key role in predicting the evolution of dynamical systems across a wide variety of engineering and scientific domains. It combines computer simulations with real-world observations to improve forecasting performance. One popular method, 4D-Var, is effective but can be computationally slow and has known difficulties with complex chaotic systems such as those in weather prediction.
We present a novel framework called Tensor-Var, which improves both computational time and accuracy. Rather than relying directly on deep learning models, which can be challenging to understand and control, Tensor-Var simplifies the problem by transforming complex dynamics into a simple linear form that allows for faster, more reliable optimization. This is achieved by embedding the dynamics in a feature space using kernel embedding methods. We also integrate deep neural networks to learn scalable representations for large-scale problems. In tests on both chaotic systems and real weather data, Tensor-Var not only produced more accurate results but also ran up to 20 times faster than traditional methods. Our work paves the way to faster, more reliable assimilation and forecasts in science and engineering by combining machine learning, physical models, and observational data in a principled way.
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Link To Code: https://github.com/yyimingucl/TensorVar
Primary Area: Applications->Chemistry, Physics, and Earth Sciences
Keywords: Data assimilation, Dynamical system
Submission Number: 10431