CMDNet: A Cross-Modality Spatiotemporal Graph Network for Enhanced Air Pollution Prediction With High-Resolution Satellite Data

Published: 2025, Last Modified: 09 Nov 2025. Venue: IEEE Trans. Geosci. Remote. Sens. 2025. License: CC BY-SA 4.0
Abstract: Air pollution prediction plays a vital role in urban management and public health: early warnings of PM2.5, SO2, and NO2 concentrations help mitigate the adverse effects of these pollutants. Traditional prediction methods, which rely on physical and statistical models, often struggle to capture the complex spatiotemporal dependencies and dynamic characteristics of air pollution data. Deep learning methods, especially graph neural networks (GNNs), have shown promise in addressing these limitations. However, existing GNN-based methods do not integrate the rich semantic information provided by high-resolution satellite data. To address this problem, we propose a cross-modality dynamic spatiotemporal graph neural network (CMDNet) for air pollution prediction. The model comprises two branches: a dynamic spatiotemporal GNN (DSTGNN) branch and a remote sensing image dynamic encoding network (RSIDEN) branch. The DSTGNN branch captures the spatiotemporal dependencies in air pollution data by constructing a dynamic graph structure. The RSIDEN branch extracts semantic information from high-resolution satellite data, which improves the model's ability to perceive air pollution conditions in different regions. Experiments on real-world datasets demonstrate that CMDNet outperforms existing state-of-the-art models, with maximum improvements of 2.4% (MAE), 1.8% (RMSE), 1.3% (CSI), 1.6% (FAR), and 1.4% (POD).
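For readers unfamiliar with the five reported metrics, the following is a minimal sketch of how they are commonly computed. It uses the standard textbook definitions, which may differ in detail from the paper's evaluation code. The function names, the sample values, and the exceedance threshold of 75.0 are illustrative assumptions and are not taken from the paper.

```python
import math


def regression_errors(observed, predicted):
    """Return (MAE, RMSE) for paired observed/predicted concentrations."""
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse


def event_scores(observed, predicted, threshold):
    """Return (CSI, FAR, POD) for exceedance events at `threshold`.

    An "event" is a concentration at or above the threshold. The threshold
    is a hypothetical parameter; the abstract does not state the one used.
    """
    hits = misses = false_alarms = 0
    for o, p in zip(observed, predicted):
        obs_event = o >= threshold
        pred_event = p >= threshold
        if obs_event and pred_event:
            hits += 1          # event observed and predicted
        elif obs_event:
            misses += 1        # event observed but not predicted
        elif pred_event:
            false_alarms += 1  # event predicted but not observed

    def ratio(num, den):
        # Undefined when no relevant events occurred.
        return num / den if den else float("nan")

    csi = ratio(hits, hits + misses + false_alarms)  # critical success index
    far = ratio(false_alarms, hits + false_alarms)   # false alarm ratio
    pod = ratio(hits, hits + misses)                 # probability of detection
    return csi, far, pod


# Made-up sample values, for illustration only.
observed = [30.0, 80.0, 120.0, 60.0, 90.0]
predicted = [35.0, 70.0, 110.0, 80.0, 95.0]

mae, rmse = regression_errors(observed, predicted)
csi, far, pod = event_scores(observed, predicted, threshold=75.0)
print(f"MAE={mae:.2f} RMSE={rmse:.2f} CSI={csi:.2f} FAR={far:.2f} POD={pod:.2f}")
```

Note that MAE, RMSE, and FAR are lower-is-better, while CSI and POD are higher-is-better, so an "improvement" in FAR means a reduction.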