A Systematic Evaluation of Out-of-Distribution Generalization in Crop Yield Prediction

Published: 17 Mar 2026, Last Modified: 17 Mar 2026. Accepted by TMLR. License: CC BY 4.0
Abstract: Accurate crop yield forecasting under shifting climatic conditions is essential for food security and agricultural resilience. While recent deep learning models achieve strong performance in in-domain settings, their ability to generalize across space and time, which is critical for real-world deployment, remains poorly understood. In this work, we present the first systematic evaluation of temporally aware crop yield prediction models under spatio-temporal out-of-distribution (OOD) conditions, using corn and soybean data across more than 1,200 U.S. counties. We benchmark two representative architectures, GNN-RNN and MMST-ViT, using rigorous evaluation strategies including year-ahead forecasting, leave-one-region-out validation, and stratified OOD scenarios of varying difficulty based on USDA Farm Resource Regions. Our comprehensive analysis reveals significant performance gaps across agro-ecological zones, with some models showing negative R² values under distribution shift. We uncover asymmetric transferability patterns and identify the Prairie Gateway region as consistently challenging for generalization. These findings challenge prior generalizability claims and provide practical insights for deploying agricultural AI systems under climate variability.
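The two split strategies named in the abstract, and the reason R² can turn negative under distribution shift, can be illustrated with a minimal sketch. This is not the paper's evaluation code: the function names, the region labels, and the toy values are hypothetical, and the R² shown is the standard coefficient of determination, which the abstract does not spell out.

```python
def leave_one_region_out(regions):
    """Yield (held_out, train_idx, test_idx), holding out one region at a time."""
    for held_out in sorted(set(regions)):
        train_idx = [i for i, r in enumerate(regions) if r != held_out]
        test_idx = [i for i, r in enumerate(regions) if r == held_out]
        yield held_out, train_idx, test_idx


def year_ahead_split(years, test_year):
    """Train on every year strictly before test_year; test on test_year only."""
    train_idx = [i for i, y in enumerate(years) if y < test_year]
    test_idx = [i for i, y in enumerate(years) if y == test_year]
    return train_idx, test_idx


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical region label and year for four county-year samples.
regions = ["A", "A", "B", "C"]
years = [2019, 2020, 2021, 2021]

splits = list(leave_one_region_out(regions))
train_idx, test_idx = year_ahead_split(years, 2021)

# Predictions that are worse than predicting the test-set mean give R² < 0.
score = r2_score([1.0, 2.0, 3.0], [3.0, 3.0, 3.0])
```

A negative R² therefore means the model's squared error on the held-out region or year exceeds that of a constant predictor equal to the test-set mean, which is how such values should be read in an OOD setting.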
Submission Type: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: We thank the reviewers and action editor for their constructive feedback. The following changes have been made in response to the review comments:
- Title updated: Removed "Climate-Aware" from the title as suggested by Reviewer DhdB, since the work focuses on OOD generalization rather than climate change modeling per se.
- Section 1 revised: The classification of DL models on page 2 has been rewritten to emphasize that remote sensing and meteorological inputs are complementary modalities rather than mutually exclusive categories.
- Citations added: Explicit in-text citations to \citet{fan2022gnnrnn} and \citet{lin2023mmstvit} have been added at the point of first reproduction in Section 5.1, as requested.
- Table references added: Tables 5 and 6 (pairwise RMSE matrices) are now explicitly referenced in Section 5.3, with an explanatory sentence clarifying their role in revealing asymmetric OOD transferability patterns.
- Color scale added: A color scale legend has been added below Table 5 specifying RMSE thresholds for both soybean and corn. Table 6 references this scale.
- Table 1 clarified: The caption and a follow-up sentence now explain why each experiment is necessary for a robust OOD evaluation protocol.
- Conclusion rephrased: Overly strong wording ("climate-resilient agriculture") has been replaced with more precise language ("improving crop yield forecasting").
Assigned Action Editor: ~Jacek_Cyranka1
Submission Number: 5960