Improving Semantic Segmentation Models through Synthetic Data Generation via Diffusion Models

ICLR 2024 Workshop DMLR Submission 22 Authors

Published: 04 Mar 2024, Last Modified: 02 May 2024, DMLR @ ICLR 2024, CC BY 4.0
Keywords: semantic image synthesis, semantic segmentation, diffusion models, anomaly localization, synthetic data generation
TL;DR: This paper addresses semantic image synthesis via Diffusion Models to enlarge an industrial semantic segmentation dataset.
Abstract: It is often difficult to obtain industrial data for semantic segmentation because of the cost and time required for annotation. However, deep learning models perform poorly when trained on small datasets. Recent advances in generative models can be exploited to augment existing datasets with synthetic data. In semantic segmentation, generating images alone is not sufficient: each image must match its corresponding label at the pixel level, which makes data generation more challenging. Our work exploits Diffusion Models to generate synthetic data for industrial semantic segmentation. Our thorough experimentation reveals that the generated data can improve the performance of semantic segmentation models, without altering their architecture, when the models are trained on a mix of real and synthetic data. Additionally, some experiments demonstrate that improvements can be achieved by training the models exclusively on synthetic data. The code will be available upon acceptance.
Primary Subject Area: Domain specific data issues
Paper Type: Research paper: up to 8 pages
Participation Mode: Virtual
Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.
Submission Number: 22