Abstract: Long-sequence precipitation forecasting is critical for both meteorological science and smart city applications. The task is to predict future radar echo sequences from current observations; these sequences provide a high-resolution, timely reference for the distribution of atmospheric precipitation. However, the chaotic nature of precipitation systems makes it difficult to extend reliable forecast horizons. Most existing methods struggle with accuracy and clarity when extended to long-sequence predictions, such as three-hour forecasts, primarily because a single modality carries too little spatio-temporal information as the forecast horizon grows. In this paper, we propose a cascading forecasting framework that adaptively extracts and integrates multimodal spatio-temporal information to support accurate and realistic long-sequence radar forecasting. Our framework consists of a temporal adaptive predictor and a flow-based precipitation distribution adaptor. The predictor uses a multi-branch encoder-decoder architecture, which allows it to extract information from meteorological sequences from multiple sources at varying scales and produce an initial global precipitation estimate. Its core component is a carefully designed cross-attention module with a temporal adaptive layer that enhances multi-modality alignment. The flow-based adaptor then refines the initial estimate, adjusting the prediction to match the target precipitation distribution, enhancing local details, and correcting extreme precipitation patterns. We validate our method on a real multi-source dataset for long-sequence forecasting, and the experimental results demonstrate that our approach outperforms existing state-of-the-art methods.