Minimizing Data, Maximizing Performance: Generative Examples for Continual Task Learning

Published: 06 May 2025, Last Modified: 06 May 2025 · SynData4CV · CC BY 4.0
Keywords: Synthetic Data, Continual Learning, Sample Selection, Robustness
TL;DR: This work proposes enhancing efficiency, generalization, and robustness in CL by substituting synthetic images for natural training data and by applying a sample minimization strategy.
Abstract: Synthetic data is emerging as a powerful tool in computer vision, offering advantages in privacy and security. As generative AI models advance, they enable the creation of large-scale, diverse datasets that eliminate concerns related to sensitive data sharing and costly data collection. However, fundamental questions arise: (1) can synthetic data replace natural data in a continual learning (CL) setting? (2) How much synthetic data is sufficient to achieve a desired performance? (3) How well does a network trained on synthetic data generalize? To address these questions, we propose a sample minimization strategy for CL that enhances efficiency, generalization, and robustness by selectively removing uninformative or redundant samples during training. We apply this method to a sequence of tasks derived from the GenImage dataset. This setting allows us to train early tasks entirely on synthetic data and to analyze how well the resulting knowledge transfers to subsequent tasks and to evaluation on natural images. Furthermore, our method allows us to investigate the impact of removing potentially incorrect, redundant, or harmful training samples. We aim to maximize CL efficiency by removing uninformative images and to enhance robustness through both adversarial training and structured data removal. We experimentally show that the training order of synthetic and natural data, as well as the choice of generative models, significantly affects both CL performance and how much natural data can be removed. Our findings provide key insights into how generative examples can be leveraged for adaptive and efficient CL in evolving environments.
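To make the sample minimization idea concrete, the following is a minimal sketch of one possible pruning step, not the authors' actual method: it assumes each training sample is scored by its per-sample loss and that the lowest-loss (assumed uninformative) samples are dropped at a fixed keep ratio before training on the next task. The scoring criterion, the keep ratio, and the function name are illustrative assumptions, since the abstract does not specify them.

```python
# Illustrative sketch only -- not the paper's method. Assumes per-sample loss
# is available as an informativeness score and that a fixed fraction of the
# highest-loss samples is kept for the next CL task.
import numpy as np

def select_informative(per_sample_loss: np.ndarray, keep_ratio: float = 0.5) -> np.ndarray:
    """Return indices of the highest-loss (assumed most informative) samples."""
    n_keep = max(1, int(len(per_sample_loss) * keep_ratio))
    order = np.argsort(per_sample_loss)[::-1]  # highest loss first
    return order[:n_keep]

# Example: prune half of a task's data before training on it.
rng = np.random.default_rng(0)
losses = rng.random(1000)  # stand-in for recorded per-sample training losses
kept = select_informative(losses, keep_ratio=0.5)
print(f"kept {len(kept)} of {len(losses)} samples")
```

In practice, such a pruning step would be applied per task in the CL sequence, so the same mechanism can drop uninformative synthetic or natural samples before each training phase.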
Supplementary Material: zip
Submission Number: 75