Keywords: materials science, foundation models, synthetic data generation
Abstract: Developments in deep learning have facilitated the automatic visual analysis of scientific data, driving forward exploratory research. However, these approaches depend on large amounts of expert-annotated data for effective training, which is difficult to come by in narrow application domains. In this work, we address the challenges of performing visual analysis of high-speed X-ray phase-contrast images of the combustion of molten metal particles. In this case, manual annotation of thousands of complex frames is highly impractical. To address this, we propose a synthetic data generation framework that eliminates the need for large-scale manual labelling by generating image-annotation pairs for the task of image segmentation. We first train a denoising diffusion model with a small number of annotated samples to generate image-binary mask pairs. We then use the predictions of a fine-tuned segmentation foundation model to create multi-class semantic annotations for the synthetic dataset. We apply our framework to X-ray phase-contrast videos of particle combustion. From 200 manually annotated frames, we generate 10,000 synthetic image-annotation pairs. We demonstrate that training semantic segmentation models with our generated synthetic data yields a significant boost in performance.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 6