KD-MedSAM: Lightweight Knowledge Distillation of Segment Anything Model for Multi-modality Medical Image Segmentation

Published: 2025, Last Modified: 25 Jan 2026 · ICIC (28) 2025 · CC BY-SA 4.0
Abstract: The effectiveness of the Segment Anything Model (SAM) in image segmentation is widely recognized. Although SAM performs excellently in natural image segmentation scenarios, medical image segmentation typically requires fine-tuning due to data and task specificity. While fine-tuning improves task-specific performance in medical imaging, SAM's large parameter count incurs significant computational overhead, making fine-tuning itself a major obstacle. Additionally, SAM relies on manual prompts, which can be costly in medical scenarios. To address these challenges, we introduce KD-MedSAM, a lightweight knowledge-distillation framework for multi-modality medical image segmentation based on the Segment Anything Model, which combines self-supervised learning with knowledge distillation. Using Masked Image Modeling (MIM), KD-MedSAM distills the encoder knowledge of SAM into a lightweight encoder, eliminating the need for labeled data and reducing computational cost. We then fine-tune the lightweight encoder and connect it to a decoder via skip connections at multiple resolutions within the segmentation network, which removes the need for manual prompts and enables end-to-end segmentation. Our method outperforms other SAM knowledge-distillation methods, achieving the best results on four multi-modality medical image datasets.
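The MIM-based distillation step described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the `teacher` and `student` encoders are hypothetical linear proxies for the frozen SAM encoder and the lightweight encoder, and all names and hyperparameters (e.g. the 0.75 mask ratio) are assumptions. The key idea shown is that the student reconstructs the teacher's features at masked patch positions, so no segmentation labels are required.

```python
import numpy as np

rng = np.random.default_rng(0)

def mask_patches(patches, mask_ratio=0.75, rng=rng):
    """Randomly zero out a fraction of patch embeddings (Masked Image Modeling)."""
    n = patches.shape[0]
    n_masked = int(n * mask_ratio)
    idx = rng.permutation(n)[:n_masked]
    masked = patches.copy()
    masked[idx] = 0.0
    return masked, idx

dim = 16
# Hypothetical stand-ins for the frozen SAM encoder (teacher) and the
# lightweight encoder (student); each maps (n_patches, dim) -> (n_patches, dim).
W_t = rng.normal(size=(dim, dim))
W_s = rng.normal(size=(dim, dim))
teacher = lambda x: x @ W_t  # frozen teacher: features on the full image
student = lambda x: x @ W_s  # student being distilled from masked input

patches = rng.normal(size=(64, dim))  # unlabeled image patches
masked, idx = mask_patches(patches)

# Distillation target: teacher features computed on the *unmasked* image.
# The student predicts them from the masked input; the loss is taken only
# at masked positions, so no ground-truth masks or labels are needed.
target = teacher(patches)
pred = student(masked)
loss = np.mean((pred[idx] - target[idx]) ** 2)
```

In a real pipeline the loss would drive gradient updates of the student encoder only, with the SAM teacher kept frozen throughout.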