Leveraging Zero-Shot Medical Segmentation Models for Data Annotation and Model Training in Lumbar Vertebrae Segmentation
Keywords: Zero-shot medical segmentation, TotalSegmentator, MedSAM-2, automated data annotation, UNet models, lumbar vertebrae segmentation
TL;DR: Zero-shot models (TotalSegmentator, MedSAM-2) were tested for lumbar vertebrae segmentation. MedSAM-2 excelled in binary tasks, TotalSegmentator in multi-label. UNet models trained on synthetic masks performed well.
Abstract: This study investigates the potential of zero-shot medical segmentation models, specifically
TotalSegmentator and MedSAM-2, for automated data annotation and subsequent training
of UNet models in lumbar vertebrae segmentation tasks. We evaluated the performance of
these models on both binary and multi-label segmentation tasks using a ground truth test
set. Three UNet models (two binary and one multi-label) were trained using masks generated by TotalSegmentator. The performance of MedSAM-2, TotalSegmentator, and the
trained UNet models was assessed on a ground truth test set. Additionally, we evaluated the
binary UNet model on a synthetic test set generated by TotalSegmentator. Results show
that MedSAM-2 achieved the highest Dice scores in binary tasks (0.829 and 0.739), while
TotalSegmentator performed best in the multi-label scenario (Dice: 0.760). The UNet models trained on TotalSegmentator-generated masks demonstrated competitive performance,
particularly in binary tasks (Dice: 0.802 and 0.741). Notably, the binary UNet model
achieved a high Dice score of 0.898 on the synthetic test set, indicating strong consistency
with TotalSegmentator’s annotations. However, the performance gap between synthetic
and ground truth evaluations suggests potential domain adaptation challenges. These findings indicate that while zero-shot models can significantly reduce annotation burdens, their
synthetic labels may require refinement for optimal downstream model training, especially
in complex multi-class settings. This study highlights the context-dependent utility of
zero-shot models in medical image segmentation and underscores the importance of robust
validation strategies when leveraging synthetic annotations for model training.
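
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a Dice score can be computed for binary and multi-label masks. It is an illustration only, not the evaluation code used in this study: the function names `dice_score` and `mean_dice`, the integer label encoding, the toy arrays, and the convention of returning 1.0 when a label is absent from both masks are all assumptions made here.

```python
import numpy as np

def dice_score(pred, target, label=1):
    """Dice coefficient for one label: 2|A ∩ B| / (|A| + |B|)."""
    pred_mask = (pred == label)
    target_mask = (target == label)
    denom = pred_mask.sum() + target_mask.sum()
    if denom == 0:
        # Assumed convention: label absent from both masks counts as a perfect match.
        return 1.0
    intersection = np.logical_and(pred_mask, target_mask).sum()
    return float(2.0 * intersection / denom)

def mean_dice(pred, target, labels):
    """Unweighted mean of per-label Dice scores (one common multi-label summary)."""
    return float(np.mean([dice_score(pred, target, l) for l in labels]))

# Toy 2x3 masks with background 0 and two hypothetical vertebra labels 1 and 2.
pred = np.array([[0, 1, 1],
                 [0, 2, 2]])
target = np.array([[0, 1, 0],
                   [0, 2, 2]])

print(dice_score(pred, target, label=1))    # label 1 overlaps on one of three marked voxels
print(mean_dice(pred, target, labels=[1, 2]))
```

Under this sketch, a binary task is the single-label case, while a multi-label score averages over vertebra labels, so an error on any one label lowers the summary value.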
Submission Number: 8