Abstract: Several generative models have been proposed for image segmentation. Recent diffusion models exhibit generation and segmentation abilities superior to those of earlier models based on variational autoencoders or generative adversarial networks. However, diffusion models generally require large network structures for the reverse diffusion process, as well as considerable memory. For the segmentation of three-dimensional (3D) medical volume data, the original data are typically decomposed into multiple two-dimensional (2D) slices or smaller patches as input to the segmentation model to reduce computational cost. However, this approach limits the receptive field available to capture the structure of the original volume. We propose a novel two-stage diffusion-based segmentation model that retains the original volume structure at reduced computational cost. In the first stage, coarse detection and segmentation are performed on the entire volume. In the second stage, finer segmentation is performed on each small patch. Furthermore, the prediction map obtained in the first stage is used as a condition during the denoising process, so that global information is incorporated into the fine segmentation. The proposed method was applied to brain tumor segmentation on the BraTS 2021 dataset and achieved improved accuracy for multiple types of tumor regions. For the segmentation of smaller enhancing tumor (ET) regions, the proposed method achieved a Dice score of 88.0%, which is 1.6% higher than that of the baseline model, Diff-UNet. This represents a substantial improvement over previous models, especially in the challenging task of brain tumor segmentation.
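The two-stage data flow described above can be sketched as follows. This is a minimal illustration of the pipeline structure only, not the authors' implementation: the function names (`coarse_segment`, `fine_segment`, `two_stage_segment`) and the threshold-based stand-ins for the diffusion models are hypothetical, chosen merely to show how the stage-1 prediction map is passed as a per-patch condition to stage 2.

```python
import numpy as np

def coarse_segment(volume):
    # Stage 1 (hypothetical stand-in): coarse prediction over the whole volume.
    # In the paper this role is played by a diffusion model; a simple threshold
    # is used here only to illustrate the data flow.
    return (volume > 0.5).astype(np.float32)

def fine_segment(patch, coarse_patch):
    # Stage 2 (hypothetical stand-in): patch-wise refinement conditioned on the
    # stage-1 map. In a diffusion model, coarse_patch would typically be
    # supplied as an extra channel of the denoising input.
    local = (patch > 0.5).astype(np.float32)
    return np.clip(0.5 * local + 0.5 * coarse_patch, 0.0, 1.0)

def two_stage_segment(volume, patch_size=4):
    # Run the coarse stage once on the full volume, then refine each small
    # patch using the matching patch of the coarse prediction as condition.
    coarse = coarse_segment(volume)
    out = np.zeros_like(volume, dtype=np.float32)
    d, h, w = volume.shape
    for z in range(0, d, patch_size):
        for y in range(0, h, patch_size):
            for x in range(0, w, patch_size):
                sl = (slice(z, z + patch_size),
                      slice(y, y + patch_size),
                      slice(x, x + patch_size))
                out[sl] = fine_segment(volume[sl], coarse[sl])
    return out
```

Because the condition comes from a whole-volume prediction, each patch-level refinement sees global context that a purely patch-based model would lack.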
External IDs: dblp:conf/ijcnn/NishimuraUIN24