Open-source federated learning across multi cloud environment

Published: 10 Jun 2025, Last Modified: 17 Jul 2025
Venue: TerraBytes 2025 (without proceedings)
License: CC BY 4.0
Keywords: data discovery; data processing; geospatial
Abstract: Hundreds of petabytes of data are generated daily by satellites, weather models, and Earth models. Machine learning (ML) and artificial intelligence (AI) promise great advances in understanding the impact of climate change on Earth and in discovering climate adaptation and mitigation solutions. The main challenge in advancing AI for climate is the vast logical and physical distribution of climate-relevant data and information. The data is usually hosted in an openly accessible cloud environment. However, assembling an AI-ready datacube requires accessing each platform individually and applying spatial and temporal filters. We describe a distributed open-source data platform that aggregates data through a SpatioTemporal Asset Catalog (STAC), enabling quick data discovery and download of only the data of interest. The various datasets are harmonized using the openEO framework, which assembles the AI datacubes efficiently and speeds up their generation.
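To illustrate the kind of spatial and temporal filtering the abstract refers to, the following is a minimal sketch, not the platform's actual implementation. It builds a request body in the general shape of a STAC API item search (`collections`, `bbox`, `datetime`, `limit`) and applies the same two filters locally to a few in-memory items. The collection name, item ids, bounding boxes, and dates are all hypothetical, and no network call is made.

```python
import json
from datetime import datetime, timezone

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def build_search_body(collections, bbox, start, end, limit=100):
    """Build a JSON body in the shape of a STAC API item search request."""
    return {
        "collections": list(collections),
        "bbox": list(bbox),  # [west, south, east, north] in degrees
        "datetime": f"{start.strftime(TIME_FORMAT)}/{end.strftime(TIME_FORMAT)}",
        "limit": limit,
    }


def bbox_intersects(a, b):
    """Return True if two [west, south, east, north] boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def parse_utc(text):
    """Parse a UTC timestamp such as 2024-06-15T10:00:00Z."""
    return datetime.strptime(text, TIME_FORMAT).replace(tzinfo=timezone.utc)


def filter_items(items, bbox, start, end):
    """Keep only items that overlap the area and fall inside the time window."""
    selected = []
    for item in items:
        when = parse_utc(item["properties"]["datetime"])
        if bbox_intersects(item["bbox"], bbox) and start <= when <= end:
            selected.append(item)
    return selected


# Hypothetical area of interest and time window.
aoi = (5.0, 45.0, 10.0, 48.0)
start = datetime(2024, 6, 1, tzinfo=timezone.utc)
end = datetime(2024, 6, 30, tzinfo=timezone.utc)

# Hypothetical catalog entries, reduced to the fields used by the filters.
items = [
    {"id": "scene-a", "bbox": [6.0, 46.0, 7.0, 47.0],
     "properties": {"datetime": "2024-06-15T10:00:00Z"}},
    {"id": "scene-b", "bbox": [20.0, 46.0, 21.0, 47.0],  # outside the area
     "properties": {"datetime": "2024-06-15T10:00:00Z"}},
    {"id": "scene-c", "bbox": [6.0, 46.0, 7.0, 47.0],  # outside the time window
     "properties": {"datetime": "2024-01-15T10:00:00Z"}},
]

body = build_search_body(["hypothetical-collection"], aoi, start, end)
selected = filter_items(items, aoi, start, end)

print(json.dumps(body, indent=2))
print([item["id"] for item in selected])
```

In a real deployment the request body would be sent to a catalog's search endpoint and the filtering would happen server-side, so that only the matching assets need to be downloaded.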
Submission Number: 34