Keywords: Weather Forecasting, Global Medium Range Weather Forecasting, AI4Science, Efficient ML
TL;DR: A competitive global medium-range weather forecasting model trained in 3 days on one A100
Abstract: Data-driven weather forecasting models now rival or exceed the skill of traditional Numerical Weather Prediction (NWP) techniques. However, most state-of-the-art models operate at a scale accessible only to well-resourced institutions, often requiring large multi-GPU clusters and prohibitive training budgets. This centralisation prevents smaller academic groups and startups not only from adapting models for specialised downstream tasks but also from contributing to core research on efficient forecasting methodologies. In this work, we introduce Otter Weather, a streamlined, deterministic forecasting model designed to democratise access to high-performance AI weather prediction. Trained on ERA5 reanalysis data and evaluated against standard WeatherBench metrics, our model achieves a 9.4% improvement over the best available deterministic NWP model at a 24h lead time. Notably, Otter Weather is trained in three A100-days (corresponding to $70 in commercial computing costs) and is competitive with significantly more expensive state-of-the-art methods, advancing the skill-versus-compute Pareto frontier for deterministic models. By prioritising architectural simplicity and incorporating optimised techniques from language modelling and computer vision with minimal task-specific inductive biases, Otter Weather offers a highly efficient, adaptable foundation for the broader scientific community. Finally, through comprehensive ablations of architectural, optimisation, and regularisation choices, we provide a practical recipe for efficient model training.
Submission Number: 41