Dataset Distillation for Eurosat

Julius Lautz; Daniel Leal; Linus Scheibenreif; Damian Borth; Michael Mommert

Published: 01 Jan 2023, Last Modified: 14 Nov 2024. Venue: IGARSS 2023. License: CC BY-SA 4.0

Abstract: In supervised learning, which is commonly used in remote sensing applications, a model trained on a larger dataset generally performs at least as well as one trained on a smaller dataset. Dataset distillation condenses the discriminative features of a larger dataset into a smaller one, isolating the characteristics of the original dataset that are critical for learning. This has implications for computational efficiency and for understanding the underlying representation learning dynamics. These implications are of particular interest in remote sensing, where large amounts of data are generated and processed every day. We apply dataset distillation across multiple network architectures to the RGB bands of the EuroSAT dataset to test how it behaves in a remote sensing scenario with real-world data. Our distilled dataset consistently outperforms random sampling by 5%-10% in downstream classification tasks.
