Open-source pre-trained models hold great potential for diverse applications, but their utility declines when their training data is unavailable. Data-Free Image Synthesis (DFIS) aims to generate images that approximate the learned data distribution of a pre-trained model without accessing the original data. However, existing DFIS methods produce samples that deviate from the training data distribution due to the lack of prior knowledge about natural images. To overcome this limitation, we propose DDIS, the first Diffusion-assisted Data-free Image Synthesis method that leverages a text-to-image diffusion model as a powerful image prior, improving synthetic image quality. DDIS extracts knowledge about the learned distribution from the given model and uses it to guide the diffusion model, enabling the generation of images that accurately align with the training data distribution. To achieve this, we introduce Domain Alignment Guidance (DAG) that aligns the synthetic data domain with the training data domain during the diffusion sampling process. Furthermore, we optimize a single Class Alignment Token (CAT) embedding to effectively capture class-specific attributes in the training dataset. Experiments on PACS and ImageNet demonstrate that DDIS outperforms prior DFIS methods by generating samples that better reflect the training data distribution, achieving SOTA performance in data-free applications.
Many powerful AI models are available, but adapting them to new tasks often requires their original training data, which is frequently unavailable due to privacy or copyright concerns. Data-Free Image Synthesis (DFIS) offers a solution by creating synthetic images that resemble the original training data. However, current DFIS methods struggle because they generate images without any prior knowledge of what natural images look like, producing artificial-looking results that fail to match the original data. Our method, DDIS (Diffusion-assisted Data-free Image Synthesis), addresses this by being the first to use a Text-to-Image (T2I) diffusion model, such as Stable Diffusion, as an image prior. Because these T2I models already encode rich knowledge about natural images, DDIS can synthesize far more realistic samples. DDIS operates through two key mechanisms: first, Domain Alignment Guidance (DAG) steers the diffusion sampling process so that synthetic images match the "style," or domain, of the original training data; second, an optimized Class Alignment Token (CAT) embedding captures the specific attributes of each class in the training set. Together, these techniques yield synthetic images that are much closer to the real training data, allowing pre-trained models to be used effectively for tasks such as knowledge distillation (transferring knowledge from a large model to a smaller one) and model pruning (making models more efficient), even when the original training data is unavailable.
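For readers who want a concrete picture of how a pre-trained classifier can steer a diffusion model's sampling loop, the sketch below shows one way such guidance could look in code. It is an illustrative assumption built on a diffusers-style API, not the authors' released implementation; helpers such as decode_latents are hypothetical placeholders, and the paper's exact DAG and CAT formulations are not reproduced here.

```python
# Hypothetical sketch of classifier-guided diffusion sampling, in the spirit of
# steering a T2I diffusion model with a given pre-trained classifier.
# Not the DDIS implementation; names like `decode_latents` are placeholders.
import torch
import torch.nn.functional as F

@torch.enable_grad()
def guided_denoise_step(latent, t, unet, scheduler, prompt_embeds,
                        classifier, decode_latents, target_class, scale=1.0):
    """One denoising step, nudged by feedback from the pre-trained classifier."""
    latent = latent.detach().requires_grad_(True)

    # Predict noise with the diffusion U-Net, conditioned on the text embedding
    # (which would carry the learned class token in a CAT-style setup).
    noise_pred = unet(latent, t, encoder_hidden_states=prompt_embeds).sample

    # The scheduler step yields the next latent and a prediction of the clean sample.
    step = scheduler.step(noise_pred, t, latent)
    x0_pred = step.pred_original_sample

    # Score the predicted clean image with the given pre-trained model; its gradient
    # pulls the sample toward that model's learned (training-data) distribution.
    logits = classifier(decode_latents(x0_pred))
    loss = F.cross_entropy(logits, target_class)
    grad = torch.autograd.grad(loss, latent)[0]

    # Adjust the next latent against the classifier loss before continuing sampling.
    return (step.prev_sample - scale * grad).detach()
```

In a full pipeline, a step like this would be applied at every denoising timestep, with prompt_embeds containing the optimized class-specific token embedding.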