OceanBench: A Benchmark for Data-Driven Global Ocean Forecasting Systems

Anass El Aouni; Quentin Gaudel; Juan Emmanuel Johnson; REGNIER Charly; Julien Le Sommer; van Gennip; Ronan Fablet; Marie Drevillon; Yann DRILLET; Pierre Yves Le Traon

Published: 18 Sept 2025, Last Modified: 30 Oct 2025

Venue: NeurIPS 2025 Datasets and Benchmarks Track (poster)

License: CC BY-SA 4.0

Keywords: Machine Learning, Ocean Forecast, Benchmarking, Operational Oceanography.

TL;DR: We introduce OceanBench, a global benchmark with curated data and standardized evaluation tracks to advance reproducible, data-driven short-range ocean forecasting.

Abstract: Data-driven approaches, particularly those based on deep learning, are rapidly advancing Earth system modeling. However, their application to ocean forecasting remains limited despite the ocean's pivotal role in climate regulation and marine ecosystems. To address this gap, we present OceanBench, a benchmark designed to evaluate and accelerate global short-range (1–10 days) data-driven ocean forecasting. OceanBench is constructed from a curated dataset comprising first-guess trajectories, nowcasts, and atmospheric forcings from operational physical ocean models, which are typically unavailable in public datasets due to assimilation cycles. Matched observational data are also included, enabling realistic evaluation in an operational-like forecasting framework. The benchmark defines three complementary evaluation tracks: (i) Model-to-Reanalysis, where models are compared against the reanalysis dataset commonly used for training; (ii) Model-to-Analysis, which assesses generalization to a higher-resolution physical analysis; and (iii) Model-to-Observations, an Intercomparison and Validation Task Team (IV-TT) CLASS-4 evaluation against independent observational data. The first two tracks are further supported by process-oriented diagnostics to assess the dynamical consistency and physical plausibility of forecasts. OceanBench covers key ocean variables (sea surface height, temperature, salinity, and currents) along with standardized metrics grounded in physical oceanography. Baseline comparisons with operational systems and state-of-the-art deep learning models are provided. All data, code, and evaluation protocols are openly available at https://github.com/mercator-ocean/oceanbench, establishing OceanBench as a foundation for reproducible and rigorous research in data-driven ocean forecasting.
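To make the idea of a track-style comparison concrete, the following is a minimal sketch of a per-lead-day error score between a forecast and a reference field (for example a reanalysis or analysis). It is not the OceanBench API: the function name `rmse_per_lead_day`, the array layout `(lead_day, lat, lon)`, the use of NaN for land points, and the cos(latitude) area weighting are all assumptions made for illustration. The metrics actually used by the benchmark are defined in the repository linked above.

```python
import numpy as np


def rmse_per_lead_day(forecast, reference, lat_deg):
    """Area-weighted RMSE for each lead day (hypothetical helper, not OceanBench code).

    forecast, reference: arrays of shape (lead_day, lat, lon); NaN marks land.
    lat_deg: 1-D array of latitudes in degrees, one per row of the lat axis.
    """
    forecast = np.asarray(forecast, dtype=float)
    reference = np.asarray(reference, dtype=float)

    # Assumption: regular lat/lon grid, so cos(latitude) approximates cell area.
    weights = np.cos(np.deg2rad(np.asarray(lat_deg, dtype=float)))[None, :, None]
    weights = np.broadcast_to(weights, forecast.shape)

    sq_err = (forecast - reference) ** 2
    valid = ~np.isnan(sq_err)  # skip cells where either field is missing

    weighted_sum = np.where(valid, sq_err * weights, 0.0).sum(axis=(1, 2))
    weight_total = np.where(valid, weights, 0.0).sum(axis=(1, 2))
    return np.sqrt(weighted_sum / weight_total)


if __name__ == "__main__":
    # Synthetic 10-day forecast on a tiny 3 x 4 grid (illustrative data only).
    rng = np.random.default_rng(0)
    reference = rng.normal(size=(10, 3, 4))
    forecast = reference + rng.normal(scale=0.1, size=reference.shape)
    print(rmse_per_lead_day(forecast, reference, lat_deg=[-30.0, 0.0, 30.0]))
```

Computing the score separately for each lead day shows how skill degrades over the 1–10 day forecast range instead of collapsing it into a single number.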

Croissant File: json

Dataset URL: https://github.com/mercator-ocean/oceanbench/blob/main/docs/input-datasets-for-oceanbench-challenger-evaluation.rst

Code URL: https://github.com/mercator-ocean/oceanbench

Supplementary Material: zip

Primary Area: AI/ML Datasets & Benchmarks for physics (e.g. climate, health, life sciences, physics, social sciences)

Submission Number: 2566
