Not Just Pretty Pictures: Toward Interventional Data Augmentation Using Text-to-Image Generators

Published: 03 Jul 2024, Last Modified: 16 Jul 2024 · ICML 2024 FM-Wild Workshop Poster · CC BY 4.0
Keywords: Text-to-Image Generator, Generative Model, Synthetic Data, Robustness, Domain Generalization, Bias
TL;DR: We explore the use of Text-to-Image (T2I) generators for simulating arbitrary interventions on environmental factors, thereby augmenting the original training data to enhance the robustness of neural image classifiers.
Abstract: Neural image classifiers are known to undergo severe performance degradation when exposed to inputs sampled under environmental conditions that differ from those of their training data. Given the recent progress in Text-to-Image (T2I) generation, a natural question is how modern T2I generators like Stable Diffusion can be used to simulate arbitrary interventions over such environmental factors in order to augment training data and improve the robustness of downstream classifiers. We experiment across a diverse collection of benchmarks in Single Domain Generalization (SDG), finding that current T2I generators can indeed serve as a powerful interventional data augmentation mechanism, outperforming previous state-of-the-art data augmentation techniques across all datasets. More broadly, our work demonstrates the utility of generative foundation models in synthesizing interventional data that can be used to train more robust machine learning systems, facilitating the application of such technologies in new domains.
Submission Number: 60
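To make the abstract's approach concrete, below is a minimal sketch of prompt-based interventional augmentation using the Hugging Face diffusers library. The checkpoint, prompt template, and list of environmental factors are illustrative assumptions, not the paper's exact pipeline.

```python
# Minimal sketch of interventional data augmentation with a T2I generator.
# Assumptions (not from the paper): the diffusers img2img pipeline, the
# "runwayml/stable-diffusion-v1-5" checkpoint, and the prompt template below.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Hypothetical environmental factors to intervene on; a real setup would
# choose factors matching the expected distribution shift.
environments = ["in heavy snow", "at night", "in dense fog", "as a sketch"]

def augment(image: Image.Image, label: str) -> list[Image.Image]:
    """Simulate an intervention on each environmental factor while
    preserving the label-relevant content of the source image."""
    prompts = [f"a photo of a {label} {env}" for env in environments]
    out = pipe(
        prompt=prompts,
        image=[image] * len(prompts),
        strength=0.6,        # how far to deviate from the source image
        guidance_scale=7.5,  # adherence to the text prompt
    )
    return out.images

# Usage: augmented = augment(Image.open("dog.jpg").convert("RGB"), "dog")
```

The img2img formulation is one plausible choice here: conditioning on the original training image keeps the class content fixed while the prompt varies only the environmental factor, which is the spirit of an intervention rather than unconstrained synthesis.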