Keywords: Latent Diffusion, Synthetic data, Text-to-image generation, Satellite Imagery
Abstract: While satellite data is essential for applying computer vision to many real-world tasks, it remains expensive to acquire. Although other computer vision tasks have alleviated data procurement costs by augmenting training datasets with synthetic images from text-to-image models, such augmentation remains underdeveloped in the remote sensing domain. In this work, we propose an alternative approach for generating synthetic training data tailored to satellite imagery. To better understand the underlying problem, we begin by analyzing the impact of the target data distribution in comparison to the distributions used to train the text-to-image generation model. We find that data rarity is strongly correlated with the effectiveness of synthetic training data produced by Stable Diffusion fine-tuned on few-shot examples, suggesting that rarity can serve as a low-cost proxy for pre-evaluating the effectiveness of synthetic data generation. Notably, our analysis shows that Stable Diffusion struggles to produce useful training images for rare, out-of-distribution data. Building on this insight, we propose two modifications to the generation process tailored to satellite images: offset noise and leak-aligned noise. Both are designed to adjust the initial noise distribution and correct low-frequency characteristics. Our approaches enable improved training performance for classifiers trained on synthetic data, demonstrated on three satellite benchmarks.
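Of the two noise modifications named in the abstract, offset noise is a known diffusion-training trick: the per-pixel Gaussian noise is augmented with a per-channel constant offset so the model learns to shift low-frequency content (e.g. overall brightness) rather than inheriting it from the initial noise. The sketch below illustrates that idea only; the function name, `offset_strength` parameter, and NumPy formulation are illustrative assumptions, not the paper's implementation, and the paper-specific leak-aligned noise is not reproduced here.

```python
import numpy as np

def offset_noise(shape, offset_strength=0.1, seed=None):
    """Illustrative sketch of offset noise (not the paper's exact method).

    Draws standard per-pixel Gaussian noise, then adds a single random
    offset per (batch, channel) broadcast over the spatial dimensions.
    The offset perturbs the mean of each channel, i.e. the lowest
    spatial frequency, which plain i.i.d. noise barely moves.
    """
    b, c, h, w = shape
    rng = np.random.default_rng(seed)
    base = rng.standard_normal((b, c, h, w))          # per-pixel noise
    offset = rng.standard_normal((b, c, 1, 1))        # per-channel shift
    return base + offset_strength * offset
```

With `offset_strength=0`, this reduces to the standard i.i.d. Gaussian noise used in vanilla diffusion training; larger values give the sampler more freedom over low-frequency characteristics such as the overall tone of a satellite scene.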
Primary Area: generative models
Submission Number: 5860