Deep Generative Data Assimilation in Multimodal Setting

Published: 10 Apr 2024 · Last Modified: 29 Apr 2024 · CVPR 2024 Workshop on EarthVision · CC BY 4.0
Abstract: Robust integration of physical knowledge and data is key to improving computational simulations, such as Earth system models. Data assimilation is crucial for achieving this goal because it provides a systematic framework to calibrate model outputs with observations, which can include remote sensing imagery and ground station measurements, with uncertainty quantification. Conventional methods, including Kalman filters and variational approaches, inherently rely on simplifying linear and Gaussian assumptions, and can be computationally expensive. Nevertheless, with the rapid adoption of data-driven methods in many areas of computational sciences, we see the potential of emulating traditional data assimilation with deep learning, especially generative models. In particular, the diffusion-based probabilistic framework overlaps substantially with data assimilation principles: both allow for conditional generation of samples within a Bayesian inverse framework. These models have shown remarkable success in text-conditioned image generation and image-controlled video synthesis. Likewise, one can frame data assimilation as observation-conditioned state calibration. In this work, we propose SLAMS: Score-based Latent Assimilation in Multimodal Setting. Specifically, we assimilate in-situ weather station data and ex-situ satellite imagery to calibrate vertical temperature profiles globally. Through extensive ablation, we demonstrate that SLAMS is robust even in low-resolution, noisy, and sparse data settings. To our knowledge, our work is the first to apply a deep generative framework to multimodal data assimilation using real-world datasets; an important step toward building robust computational simulators, including the next-generation Earth system models. Our code is available at: https://github.com/yongquan-qu/SLAMS.
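To make the stated connection concrete, the sketch below illustrates how score-based generation can be conditioned on observations in the spirit of "observation-conditioned state calibration." This is not the authors' SLAMS implementation (which operates in a learned latent space with a trained score network over multimodal inputs); here a closed-form Gaussian prior stands in for the learned score, and the observation operator `H`, noise levels, schedule, and step sizes are all illustrative assumptions.

```python
# Minimal sketch: data assimilation as observation-conditioned score-based
# sampling. A toy Gaussian prior replaces the learned score network; all
# parameters (H, sigma_obs, noise schedule) are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

d = 4                                   # toy state dim (e.g., vertical temperature levels)
mu, tau = np.zeros(d), 1.0              # Gaussian prior N(mu, tau^2 I) stands in for the learned prior
H = np.eye(d)[:2]                       # linear observation operator: observe first 2 levels only
sigma_obs = 0.1                         # observation noise std
x_true = rng.normal(mu, tau)
y = H @ x_true + sigma_obs * rng.normal(size=2)   # sparse, noisy observations

def prior_score(x, sigma):
    # Score of the noise-perturbed prior: grad_x log p_sigma(x),
    # available in closed form for a Gaussian prior.
    return -(x - mu) / (tau**2 + sigma**2)

def likelihood_score(x):
    # Guidance term grad_x log p(y | x) for Gaussian observation noise
    # (exact only as sigma -> 0; a common approximation in guided sampling).
    return H.T @ (y - H @ x) / sigma_obs**2

# Annealed Langevin dynamics: follow the conditional score
# (prior score + likelihood score) across decreasing noise levels,
# so samples are drawn approximately from the posterior p(x | y).
x = rng.normal(size=d)
for sigma in np.geomspace(1.0, 0.01, 30):   # assumed noise schedule
    step = 0.05 * sigma**2
    for _ in range(20):
        score = prior_score(x, sigma) + likelihood_score(x)
        x = x + step * score + np.sqrt(2 * step) * rng.normal(size=d)

print("observed levels :", y)
print("analysis sample :", x)    # calibrated state, consistent with y at observed levels
```

Running the sampler repeatedly yields an ensemble of analyses whose spread reflects posterior uncertainty, which is the sense in which conditional diffusion sampling parallels ensemble data assimilation.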