DiffCNN: A collaborative framework of diffusion model and CNN for semi-supervised medical image segmentation

Shanshan Xu; Lixia Tian

Published: 01 Jan 2025, Last Modified: 24 Jul 2025. Venue: Neural Networks 2025. License: CC BY-SA 4.0

Abstract: The widely used teacher-student architecture has demonstrated great success in semi-supervised medical image segmentation. Despite its excellent performance, the architecture still faces two challenges: 1) the optimization of the teacher subnet relies heavily on the student subnet, which greatly limits the capability of the teacher subnet; 2) the CNN-based structure commonly used to build the teacher and student subnets does not cope well with noisy medical images. To address these challenges, we propose DiffCNN, a collaborative framework of diffusion model and CNN for semi-supervised medical image segmentation. Unlike classic approaches that use two subnets of the same structure, DiffCNN employs two subnets of quite different structures. Specifically, in addition to a CNN subnet, DiffCNN employs a diffusion subnet that alleviates the influence of noise by learning the underlying distribution of the mask. Collaborative training allows the diffusion and CNN subnets to learn from each other and thus to extract complementary information from the input images more effectively. Furthermore, adversarial learning is used to further enhance the capability of the diffusion subnet by forcing the diffusion-based segmentations to approach real masks. We evaluate DiffCNN on three datasets, and the results demonstrate its superior performance over state-of-the-art semi-supervised segmentation methods.
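The abstract does not give the loss functions, so the following is only a minimal sketch of what "learning from each other" could look like for two subnets, assuming a generic cross pseudo-label scheme rather than the authors' actual objective. The names `cross_entropy` and `collaborative_loss` are hypothetical, the inputs stand in for per-pixel class probabilities from the diffusion and CNN subnets, and the adversarial term is omitted.

```python
import numpy as np

def cross_entropy(probs, labels, eps=1e-8):
    """Mean pixel-wise cross-entropy.

    probs:  float array of shape (N, C, H, W) with class probabilities.
    labels: int array of shape (N, H, W) with class indices.
    """
    # Pick the predicted probability of the target class at every pixel.
    picked = np.take_along_axis(probs, labels[:, None, :, :], axis=1)
    return float(-np.log(picked + eps).mean())

def collaborative_loss(probs_a, probs_b, labels=None):
    """Hypothetical mutual-supervision loss for two subnets (an assumption,
    not the loss from the paper).

    Each subnet is trained on the other's hard pseudo-labels; if ground-truth
    labels are available, a supervised term is added for both subnets.
    """
    pseudo_a = probs_a.argmax(axis=1)  # hard labels predicted by subnet A
    pseudo_b = probs_b.argmax(axis=1)  # hard labels predicted by subnet B
    loss = cross_entropy(probs_a, pseudo_b) + cross_entropy(probs_b, pseudo_a)
    if labels is not None:
        loss += cross_entropy(probs_a, labels) + cross_entropy(probs_b, labels)
    return loss

# Toy usage: one 2x2 image, two classes, both subnets favour class 0.
probs_a = np.zeros((1, 2, 2, 2))
probs_a[:, 0], probs_a[:, 1] = 0.9, 0.1
probs_b = probs_a.copy()
unlabeled_loss = collaborative_loss(probs_a, probs_b)
```

In this sketch the loss is small when the two subnets agree and grows when their predictions diverge, which is the generic mechanism by which differently structured subnets can supervise each other on unlabeled images.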
