DiffMix: Diffusion Model-Based Data Synthesis for Nuclei Segmentation and Classification in Imbalanced Pathology Image Datasets
Abstract: Nuclei segmentation and classification is an important step in pathological image analysis. Deep learning-based approaches have significantly improved the accuracy of this task. However, these approaches suffer from imbalanced nuclei class composition in the training data, which lowers classification performance for rare nuclei classes. In this study, we propose a realistic data synthesis method based on a diffusion model. We generate two types of virtual patches to enlarge the training data distribution; these patches balance the nuclei class variance and expose the network to a wider variety of nuclei. We then use a semantic label-conditioned diffusion model to generate realistic, high-quality image samples from these patches. We demonstrate the efficacy of our method through experiments on two imbalanced nuclei datasets, where it improves the performance of state-of-the-art networks. The experimental results suggest that the proposed method improves classification of rare nuclei types while achieving superior segmentation and classification performance on imbalanced pathology nuclei datasets.
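The abstract does not spell out how the class-balanced virtual patches are constructed. As a rough illustration only, the sketch below shows one way a label map could be rebalanced before being passed as a condition to a generative model: nuclei from the most frequent class are reassigned to the least frequent class until the per-class counts differ by at most one. The function names `rebalance_classes` and `to_semantic_map`, the instance-map representation, and the reassignment rule are all assumptions made for this example, not the authors' procedure.

```python
import numpy as np


def rebalance_classes(instance_classes, num_classes, seed=0):
    """Return a copy of the per-nucleus class labels with balanced counts.

    instance_classes: 1-D int array; entry i is the class (1..num_classes)
    of the nucleus with instance id i + 1. Hypothetical representation.
    """
    rng = np.random.default_rng(seed)
    classes = np.asarray(instance_classes).copy()
    while True:
        # Count nuclei per class, dropping the background bin at index 0.
        counts = np.bincount(classes, minlength=num_classes + 1)[1:]
        major = counts.argmax() + 1
        minor = counts.argmin() + 1
        if counts[major - 1] - counts[minor - 1] <= 1:
            return classes
        # Move one randomly chosen nucleus from the majority to the minority class.
        idx = rng.choice(np.flatnonzero(classes == major))
        classes[idx] = minor


def to_semantic_map(instance_map, instance_classes):
    """Convert an instance map (0 = background) into a per-pixel class map."""
    lut = np.concatenate(([0], np.asarray(instance_classes)))
    return lut[instance_map]


# Toy example: six nuclei, five of class 1, one of class 2, none of class 3.
original = np.array([1, 1, 1, 1, 1, 2])
balanced = rebalance_classes(original, num_classes=3)
balanced_counts = np.bincount(balanced, minlength=4)[1:]

# A 2x2 instance map containing nuclei 1 and 2 on a background of zeros.
toy_instances = np.array([[0, 1], [2, 0]])
toy_semantic = to_semantic_map(toy_instances, np.array([3, 1]))
```

Only the class assignments change in this sketch; the shapes and positions of the nuclei in the instance map are left untouched, so the resulting semantic map stays spatially consistent with the original patch.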