Deep Generative Spatiotemporal Engression for Probabilistic Forecasting of Epidemics

TMLR Paper8189 Authors

31 Mar 2026 (modified: 22 May 2026), Under review for TMLR, CC BY 4.0
Abstract: Accurate and reliable forecasting of epidemic incidences is critical for public health preparedness, yet it remains a challenging task due to complex nonlinear temporal dependencies and heterogeneous spatial interactions. Often, point forecasts generated by spatiotemporal models are unreliable in assigning uncertainty to future epidemic events. Probabilistic forecasting of epidemics is therefore crucial for providing the best or worst-case scenarios rather than a simple, often inaccurate, point estimate. We present deep spatiotemporal engression methods to generate accurate and reliable probabilistic forecasts on low-frequency epidemic datasets. The proposed methods act as distributional lenses, and out-of-sample probabilistic forecasts are generated by sampling from the trained models. Our frameworks encapsulate lightweight deep generative architectures, wherein uncertainty is quantified endogenously, driven by a pre-additive noise component during model construction. We establish geometric ergodicity and asymptotic stationarity of the spatiotemporal engression processes under mild assumptions on the network weights and pre-additive noise process. Comprehensive evaluations across six epidemiological datasets over three forecast horizons demonstrate that the proposal consistently outperforms several temporal and spatiotemporal benchmarks in both point and probabilistic forecasting. Additionally, we explore the explainability of the proposal to enhance the models' practical application for informed, timely public health interventions.
Submission Type: Long submission (more than 12 pages of main content)
Changes Since Last Submission: We sincerely thank the learned editor and reviewers for their comments. We have updated the manuscript (updates are colored red) based on their suggestions. A summary of the changes made in the revised manuscript is given below:
1. We clearly state our contributions in the domain of probabilistic spatiotemporal forecasting in the Introduction (Sec. 1), explaining what is new relative to existing architectures and studies.
2. In Sec. 2, we have modified the second paragraph to be more rigorous with our framing: we also consider post-ANMs like DeepAR that allow for input-dependent predictive distributions. Moreover, we have updated Fig. 1 by adding a post-ANM DeepAR fit, and our proposed MVEN fit is now in Fig. 1(C).
3. We have restructured the Background (Sec. 3) based on Reviewer 4wMP’s suggestions. We first state the goal of distributional regression and what engression brings to this family (highlighting its ability to extrapolate). We also emphasize the role of pre-additive noise and the energy score loss for generativity.
4. In Sec. 4, we have condensed the framing of the STEN model to make the presentation more concise, following Reviewer bwPu’s comments.
5. Based on Reviewer bwPu’s suggestions, we have added a discussion of the model selection strategy in Remark 1, Sec. 4.
6. In Sec. 5, we explicitly explain each assumption and when and how it can be enforced in practice. We added two more points in Remark 3 to justify the addition of small noise to the hidden and cell states and to state how the LSTM forget gate can be made contractive. We have also modified Sec. 5.2 to better align the theory with practical implications, being more explicit that the theoretical studies are stability results for the closed-loop (idealized) processes.
7. Based on Reviewer v4zJ’s suggestions, we have included an ablation study in Sec. 8, comparing pre-additive vs. post-additive noise models, energy score loss vs. MSE, and spatial module vs. no spatial module. Figs. 12, 14, and Tables 14-16 have been added to support the findings.
8. Based on Reviewer 4wmP’s constructive suggestions, we have added to the Limitations and Future Directions that exploring ‘spatiotemporal extrapolability’ is a promising direction for future research. Because monotonicity-like assumptions are violated in real-world spatiotemporal setups, we do not formally claim spatiotemporal extrapolability in this study. We have also added a line to the Broader Impact Statement acknowledging that the proposed models sometimes produce overconfident intervals, which can be addressed through established calibration techniques.
9. The hyperparameter configurations and computational budget for the baseline models are now mentioned in Appendix E.1.
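To illustrate the two ingredients highlighted in item 3, the following is a minimal sketch of pre-additive noise sampling combined with a sample-based energy score. It is not the paper's architecture: `toy_g` is a hypothetical stand-in for a trained network, and the noise scale, sample count, and function names are assumptions chosen for illustration only.

```python
import numpy as np


def energy_score(samples, y):
    """Sample-based estimate of the energy score (lower is better).

    samples: array of shape (m, d), draws from the forecast distribution
    y:       array of shape (d,), the observed outcome
    Estimates E||X - y|| - 0.5 * E||X - X'||.
    """
    samples = np.asarray(samples, dtype=float)
    y = np.asarray(y, dtype=float)
    m = samples.shape[0]
    # Mean distance from each draw to the observation
    term_obs = np.linalg.norm(samples - y, axis=1).mean()
    # Mean pairwise distance over distinct pairs of draws
    diffs = samples[:, None, :] - samples[None, :, :]
    term_pair = np.linalg.norm(diffs, axis=2).sum() / (m * (m - 1))
    return term_obs - 0.5 * term_pair


def toy_g(z):
    # Hypothetical nonlinear map standing in for a trained network
    return np.tanh(z) + 0.1 * z


def pre_additive_samples(g, x, m, noise_scale, rng):
    """Draw m forecasts g(x + eps): noise enters before the nonlinearity."""
    eps = rng.normal(scale=noise_scale, size=(m, x.shape[0]))
    return g(x + eps)


rng = np.random.default_rng(0)
x = np.array([0.5, -1.0])
draws = pre_additive_samples(toy_g, x, m=200, noise_scale=0.3, rng=rng)
score = energy_score(draws, y=toy_g(x))
```

A post-additive model would instead draw `g(x) + eps`, so the noise would bypass the nonlinearity; in the sketch above the forecast distribution is shaped by `toy_g` itself, and the forecasts are obtained purely by sampling.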
Assigned Action Editor: ~Feng_Zhou9
Submission Number: 8189