Synthetic Multimodal Data Generation and Training Optimization For Computed Tomography Cardiac Imaging Applications
Keywords: Generative Model, Multi-modal data, Computed Tomography, Cardiac Imaging, 3D Semantic Image Segmentation, Latent Diffusion Model, Self-Supervised Learning
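TL;DR: A latent diffusion model generates synthetic 3D CT cardiac images together with corresponding heart sub-structures; the synthetic data are used for self-supervised pre-training and supervised fine-tuning to improve 3D cardiac image segmentation.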
Abstract: Computed Tomography (CT) cardiac imaging is among the most complex CT organ imaging procedures, primarily because the heart is in constant motion as it pumps blood. To capture the organ accurately, a CT scanner must scan fast enough to produce a "snapshot" of the heart, yet its temporal resolution remains limited by the mechanical constraints of the CT system. Generative AI has recently gained considerable attention, with extensive research exploring its potential to generate detailed synthetic images. In medical imaging, these techniques offer a potential solution to the scarcity of CT cardiac data that stems from these acquisition challenges. Although such synthetic images appear highly realistic, an important question remains: can they effectively support downstream tasks such as semantic image segmentation? In this paper, we introduce a novel latent diffusion model for 3D CT cardiac imaging that produces multi-modal data: synthetic CT cardiac images together with the corresponding heart sub-structures. These multi-modal synthetic data are used in both the pre-training phase (via Self-Supervised Learning) and the fine-tuning phase (via Supervised Learning). Through extensive experimentation, we demonstrate that the synthetic data generated by our model significantly enhance 3D CT cardiac image segmentation performance, contributing to more accurate and robust diagnoses.
Submission Number: 79