Abstract: Few-shot semantic segmentation holds considerable promise for low-data scenarios, especially for medical images that require expert-level dense annotations. Existing few-shot medical image segmentation methods tackle the task by means of prototype learning. However, this scheme relies on support prototypes to guide the segmentation of query images, ignoring the rich anatomical prior knowledge in medical images, which hinders effective feature enhancement. In this paper, we propose an anatomical prior guided spatial contrastive learning framework, termed APSCL, which exploits anatomical prior knowledge derived from medical images to construct contrastive learning from a spatial perspective for few-shot medical image segmentation. The new framework forces the model to learn features consistent with the embedded anatomical representations. In addition, to fully exploit the guidance information in the support samples, we design a mutual guidance decoder to predict the label of each pixel in the query image. Furthermore, APSCL can be trained end-to-end via episodic training. Comprehensive experiments on three challenging medical image datasets, i.e., CHAOS-T2, MS-CMRSeg, and Synapse, show that our method significantly surpasses state-of-the-art few-shot medical segmentation methods, with mean Dice score improvements of 3.61%, 2.30%, and 6.38%, respectively.
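Since the abstract only names the spatial contrastive objective without a formulation, the following is a minimal, purely illustrative sketch of how a pixel-level contrastive loss guided by an anatomical region map might look, assuming PyTorch; the function name, the `anat_mask` prior, and all parameters are hypothetical and not taken from the paper.

```python
import torch
import torch.nn.functional as F

def spatial_contrastive_loss(feats, anat_mask, temperature=0.07, num_samples=256):
    """Illustrative pixel-level InfoNCE loss (hypothetical, not the paper's exact loss).

    feats:     (B, C, H, W) pixel embeddings from the encoder.
    anat_mask: (B, H, W) integer map of anatomical regions (the "prior");
               pixels sharing a region id are treated as positive pairs.
    """
    B, C, H, W = feats.shape
    feats = F.normalize(feats, dim=1).permute(0, 2, 3, 1).reshape(B, H * W, C)
    labels = anat_mask.reshape(B, H * W)

    loss = feats.new_tensor(0.0)
    for b in range(B):
        # Subsample pixels so the (N x N) similarity matrix stays small.
        idx = torch.randperm(H * W, device=feats.device)[:num_samples]
        f, y = feats[b, idx], labels[b, idx]                  # (N, C), (N,)
        sim = f @ f.t() / temperature                         # cosine similarities
        pos = (y[:, None] == y[None, :]).float()              # same region -> positive
        pos.fill_diagonal_(0)                                 # exclude self-pairs
        # Numerically stable log-softmax over all non-self pairs.
        logit = sim - sim.max(dim=1, keepdim=True).values.detach()
        exp = torch.exp(logit)
        exp = exp - torch.diag_embed(torch.diagonal(exp))     # drop self terms
        log_prob = logit - torch.log(exp.sum(dim=1, keepdim=True) + 1e-12)
        n_pos = pos.sum(dim=1).clamp(min=1)
        loss = loss + (-(pos * log_prob).sum(dim=1) / n_pos).mean()
    return loss / B
```

Treating pixels that share an anatomical region id as positives is one simple way to encode the constraint, described above, that learned features stay in line with anatomical representations.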
Primary Subject Area: [Content] Vision and Language
Secondary Subject Area: [Content] Media Interpretation, [Experience] Multimedia Applications
Relevance To Conference: This work contributes to multimedia/multimodal processing by proposing a novel method that effectively handles multimodal medical data (CT and MRI). Specifically, our research contributes to the field in the following ways:
Firstly, we propose an anatomical prior guided spatial contrastive learning framework, termed APSCL, which exploits anatomical knowledge derived from medical images to construct contrastive learning from a spatial perspective for few-shot medical image segmentation.
Secondly, to fully exploit the guidance information in the support samples, we design a mutual guidance decoder that predicts the label of each pixel in the query image.
Additionally, APSCL can be trained end-to-end via episodic training (see the sketch after this summary). Comprehensive experiments on three challenging medical image datasets, i.e., CHAOS-T2, MS-CMRSeg, and Synapse, show that our method significantly surpasses state-of-the-art few-shot medical segmentation methods, with average Dice score improvements of 3.61%, 2.30%, and 6.38%, respectively.
In summary, our work addresses few-shot medical image segmentation within the field of multimedia/multimodal processing, offering an effective solution for multimodal medical image segmentation in data-scarce scenarios.
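For concreteness, below is a minimal sketch of an episodic training loop of the kind referenced above, reusing the hypothetical `spatial_contrastive_loss` from the earlier sketch; the model signature, episode loader, and `lambda_con` weighting are assumptions for illustration, not details from the paper.

```python
import torch
import torch.nn.functional as F

def train_episodically(model, episode_loader, optimizer, lambda_con=1.0, epochs=10):
    """Illustrative episodic training loop for few-shot segmentation.

    Each episode mimics the test-time setting: a few labeled support
    slices guide the segmentation of a query slice.
    """
    model.train()
    for epoch in range(epochs):
        for support_img, support_mask, query_img, query_mask, anat_mask in episode_loader:
            # Forward pass conditioned on the support set (hypothetical signature).
            query_logits, query_feats = model(support_img, support_mask, query_img)

            # Standard segmentation loss on the query prediction
            # (query_mask holds integer class indices) ...
            seg_loss = F.cross_entropy(query_logits, query_mask)
            # ... plus the hypothetical anatomical-prior contrastive term.
            con_loss = spatial_contrastive_loss(query_feats, anat_mask)

            loss = seg_loss + lambda_con * con_loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```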
Supplementary Material: zip
Submission Number: 1000