EarthView: A Large Scale Remote Sensing Dataset for Self-Supervision

Published: 01 Jan 2025, Last Modified: 25 Sept 2025. Venue: WACV (Workshops) 2025. License: CC BY-SA 4.0
Abstract: This paper presents EarthView, a comprehensive dataset designed specifically for self-supervision on remote sensing data and intended to enhance deep learning applications for Earth monitoring tasks. The dataset spans 15 terapixels of global remote sensing data, combining imagery from a diverse range of sources, including NEON, Sentinel, and a novel release of 1 m spatial resolution data from Satellogic. It provides a wide spectrum of image data at varying resolutions, drawn from different sensors and organized coherently into an accessible HuggingFace dataset in parquet format (available at https://huggingface.co/datasets/satellogic/EarthView). The data spans five years, from 2017 to 2022. Accompanying the dataset, we introduce EarthMAE, a tailored Masked Autoencoder developed to tackle the distinct challenges of remote sensing data. Trained in a self-supervised fashion, EarthMAE effectively processes different data modalities such as hyperspectral, multispectral, and topographical data, segmentation maps, and temporal structure. This model helps us show that pre-training on Satellogic data improves performance on downstream tasks. While there is still a gap to fill in MAE for heterogeneous data, we regard this combination of an expansive, diverse dataset and a versatile model adapted for self-supervised learning as a stride forward in deep learning for Earth monitoring.
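
As a rough illustration of the masked-autoencoder idea mentioned in the abstract, the sketch below shows the random patch masking step that MAE-style pre-training generally relies on: most patches are hidden, and the model learns to reconstruct them from the visible remainder. This is a generic sketch, not the EarthMAE implementation. The function name `random_patch_mask`, the 75% mask ratio, and the 16-pixel patch size are illustrative assumptions that are not taken from the paper.

```python
import random


def random_patch_mask(num_patches, mask_ratio=0.75, seed=None):
    """Split patch indices into (visible, masked) lists, MAE-style.

    Hypothetical helper for illustration only; the mask ratio is an
    assumed default, not a value reported for EarthMAE.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)  # random permutation of patch positions
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])   # hidden from the encoder
    visible = sorted(indices[num_masked:])  # fed to the encoder
    return visible, masked


# Assumed example: a 224x224 tile cut into 16x16 patches gives 14 * 14 = 196 patches.
visible, masked = random_patch_mask(num_patches=196, mask_ratio=0.75, seed=0)
print(len(visible), len(masked))  # prints "49 147"
```

In a full pipeline, only the visible patches would be encoded, and a reconstruction loss would be computed on the masked ones.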