Abstract: In clinical practice, medical imaging techniques include 2D video-based examinations that capture sequential scans and 3D volumetric imaging that builds a comprehensive 3D representation from a stack of 2D slices. The image sequences produced by both techniques provide valuable spatio-temporal characteristics for analysis and segmentation, but annotating them is extremely time-consuming and labor-intensive. To exploit this spatio-temporal coherence and address the scarcity of labeled data, we propose a novel semi-supervised semantic segmentation framework for medical image sequences, which consists of a conditioning network and a denoising network. Specifically, we embed a Sequential Feature Reconstruction module into both networks. This module reconstructs the target frame from contiguous frames and captures their shared visual features. Guided by the context-enhancing information from the conditioning network, the denoising network suppresses background noise via a Diffusion-based Noise Elimination module. Extensive experiments are conducted on 2D and 3D tasks, including cardiac segmentation, polyp segmentation, placenta vessel segmentation, and abdominal multi-organ segmentation. The results show that our method is superior to existing semi-supervised methods and, with only half of the labeled data, exhibits advantages over fully supervised medical image segmentation methods, validating its effectiveness and generalization ability.