Leveraging Zero-Shot Medical Segmentation Models for Data Annotation and Model Training in Lumbar Vertebrae Segmentation

31 Mar 2025 (modified: 12 Apr 2025), MIDL 2025 Short Papers Submission, CC BY 4.0
Keywords: Zero-shot medical segmentation, TotalSegmentator, MedSAM-2, automated data annotation, UNet models, lumbar vertebrae segmentation
TL;DR: Zero-shot models (TotalSegmentator, MedSAM-2) were tested for lumbar vertebrae segmentation. MedSAM-2 excelled in binary tasks, TotalSegmentator in multi-label. UNet models trained on synthetic masks performed well.
Abstract: This study investigates the potential of zero-shot medical segmentation models, specifically TotalSegmentator and MedSAM-2, for automated data annotation and subsequent training of UNet models in lumbar vertebrae segmentation tasks. Three UNet models (two binary and one multi-label) were trained using masks generated by TotalSegmentator. The performance of MedSAM-2, TotalSegmentator, and the trained UNet models was assessed on a ground truth test set for both binary and multi-label segmentation tasks. Additionally, we evaluated the binary UNet model on a synthetic test set generated by TotalSegmentator. Results show that MedSAM-2 achieved the highest Dice scores in the binary tasks (0.829 and 0.739), while TotalSegmentator performed best in the multi-label scenario (Dice: 0.760). The UNet models trained on TotalSegmentator-generated masks demonstrated competitive performance, particularly in the binary tasks (Dice: 0.802 and 0.741). Notably, the binary UNet model achieved a high Dice score of 0.898 on the synthetic test set, indicating strong consistency with TotalSegmentator’s annotations. However, the performance gap between synthetic and ground truth evaluations suggests potential domain adaptation challenges. These findings indicate that while zero-shot models can significantly reduce annotation burdens, their synthetic labels may require refinement for optimal downstream model training, especially in complex multi-class settings. This study highlights the context-dependent utility of zero-shot models in medical image segmentation and underscores the importance of robust validation strategies when leveraging synthetic annotations for model training.
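For readers unfamiliar with the metric, the Dice scores quoted in the abstract measure overlap between a predicted mask and a reference mask. The sketch below is illustrative only: the abstract does not say how the multi-label Dice was aggregated, so the unweighted per-label mean, the empty-mask convention, and the helper names `dice_score` and `mean_dice` are all assumptions rather than the authors' evaluation code.

```python
def dice_score(pred, target, label=1):
    """Dice coefficient for one label over two equal-length flat label sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    pred_hits = [p == label for p in pred]
    target_hits = [t == label for t in target]
    intersection = sum(p and t for p, t in zip(pred_hits, target_hits))
    total = sum(pred_hits) + sum(target_hits)
    # Assumed convention: a label absent from both masks counts as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_dice(pred, target, labels):
    """Unweighted mean of per-label Dice scores (assumed aggregation; background excluded by the caller)."""
    return sum(dice_score(pred, target, lab) for lab in labels) / len(labels)


# Toy flattened masks: 0 = background, 1 and 2 = two hypothetical vertebra labels.
pred = [0, 1, 1, 2, 2, 0]
target = [0, 1, 2, 2, 2, 0]

# Binary task: any vertebra versus background.
binary_dice = dice_score([int(v > 0) for v in pred], [int(v > 0) for v in target])

# Multi-label task: each vertebra label scored separately, then averaged.
multi_dice = mean_dice(pred, target, labels=[1, 2])
```

In this toy case the foreground regions coincide, so the binary Dice is perfect, while one voxel assigned to the wrong vertebra lowers the multi-label mean. This is one way a multi-label score can fall below a binary score on the same prediction.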
Submission Number: 8