WeatherBench 2: A Benchmark for the Next Generation of Data-Driven Global Weather Models

Ian Langmore, Stephan Hoyer, Peter Battaglia, Peter Dominik Dueben, Fei Sha

Published: 19 Jun 2024, Last Modified: 21 Sept 2024 | Journal of Advances in Modeling Earth Systems | CC BY 4.0

Abstract: WeatherBench 2 is an update to the global, medium-range (1–14 days) weather forecasting benchmark proposed by Rasp et al. (2020, https://doi.org/10.1029/2020ms002203), designed to accelerate progress in data-driven weather modeling. WeatherBench 2 consists of an open-source evaluation framework; publicly available training, ground truth, and baseline data; and a continuously updated website with the latest metrics and state-of-the-art models: https://sites.research.google/weatherbench. This paper describes the design principles of the evaluation framework and presents results for current state-of-the-art physical and data-driven weather models. The metrics are based on established practices for evaluating weather forecasts at leading operational weather centers. We define a set of headline scores to provide an overview of model performance. We also discuss caveats in the current evaluation setup and challenges for the future of data-driven weather forecasting.