Abstract: Over 40\% of the global population lives within 100 kilometers of the coast, and coastal areas contribute more than \$8 trillion annually to the global economy. Unfortunately, coastal ecosystems are increasingly vulnerable to more frequent and intense extreme weather events and rising sea levels. Coastal scientists use numerical models to simulate complex physical processes, but these models are often slow and expensive. In recent years, deep learning has emerged as a promising, lower-cost alternative to numerical models. However, progress has been hindered by the lack of a large-scale, high-resolution coastal simulation dataset to train and validate deep learning models. Existing studies often focus on relatively small datasets and simple processes. To fill this gap, we introduce a decade-long, high-resolution (<100 m) coastal circulation modeling dataset on a real-world 3D mesh in southwest Florida with around 6 million cells. The dataset contains key oceanographic variables (e.g., current velocities, free surface level, temperature, salinity) alongside external atmospheric and river forcings. We evaluate a customized Vision Transformer model that takes initial and boundary conditions and external forcings as input and predicts ocean variables at varying lead times. The dataset provides an opportunity to benchmark novel deep learning models for high-resolution coastal simulations (e.g., physics-informed machine learning, neural operator learning).
The code and dataset can be accessed at https://github.com/spatialdatasciencegroup/CoastalBench.
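The forecasting setup described in the abstract (an initial state plus external forcings as input, ocean variables at a later lead time as target) can be illustrated with a minimal sketch. This is not the CoastalBench data loader: the function name `make_lead_time_pairs`, the array layouts, and the synthetic data are all assumptions made for illustration, and boundary conditions are omitted for brevity.

```python
import numpy as np


def make_lead_time_pairs(state, forcing, lead):
    """Pair each initial state with the state `lead` steps later.

    Hypothetical layouts (not taken from the dataset):
      state:   (T, V, N) array, V ocean variables on N mesh cells
      forcing: (T, F) array, F external forcings per time step
      lead:    lead time in time steps, 1 <= lead < T
    """
    n_steps = state.shape[0]
    if forcing.shape[0] != n_steps:
        raise ValueError("state and forcing must share the time axis")
    if lead < 1 or lead >= n_steps:
        raise ValueError("lead must satisfy 1 <= lead < number of time steps")

    n_samples = n_steps - lead
    initial = state[:n_samples]   # state at time t
    target = state[lead:]         # state at time t + lead
    # Forcings over steps t .. t + lead - 1 for each sample.
    windows = np.stack([forcing[t:t + lead] for t in range(n_samples)])
    return initial, windows, target


# Synthetic example: 10 time steps, 4 variables, 5 cells, 3 forcings.
rng = np.random.default_rng(0)
state = rng.normal(size=(10, 4, 5))
forcing = rng.normal(size=(10, 3))
initial, windows, target = make_lead_time_pairs(state, forcing, lead=3)
print(initial.shape, windows.shape, target.shape)
```

Varying `lead` yields training pairs for different forecast horizons from the same simulation record; a model would then map `(initial, windows)` to `target`.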
Lay Summary: Coastal areas are home to over 40% of the global population, but modeling coastal ocean dynamics remains computationally expensive and data-limited. We introduce a decade-long, high-resolution simulation dataset and develop a deep learning model that can emulate complex coastal circulation processes. This enables faster forecasting of coastal phenomena like storm surges and water quality changes, and contributes to the broader landscape of AI for scientific discovery, highlighting both the opportunities and responsibilities involved in applying machine learning to Earth system modeling.
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Link To Code: https://github.com/spatialdatasciencegroup/CoastalBench
Primary Area: Applications->Chemistry, Physics, and Earth Sciences
Keywords: Coastal Processes, Dataset, Benchmark
Submission Number: 13347