Efficient and Robust Medical Image Segmentation Using Lightweight ViT-Tiny based SAM and Model Quantization

31 May 2024 (modified: 24 Jun 2024) · CVPR 2024 Workshop MedSAMonLaptop Submission · CC BY-SA 4.0
Keywords: ViT-Tiny based SAM · Data Augmentation · Modality Imbalance · Post-Training Model Quantization
Abstract: This paper proposes a lightweight SAM-based medical image segmentation model built on ViT-Tiny, designed to efficiently address the challenges of medical image segmentation in clinical practice. By replacing SAM's image encoder with ViT-Tiny while retaining its lightweight prompt encoder and mask decoder, we significantly reduce computational complexity while maintaining high segmentation performance. We employ a comprehensive data augmentation strategy, including window width and level adjustments, random rotations, contrast adjustments, and geometric transformations such as translation, scaling, random cropping, and affine transformations, which enhances the model's robustness and generalization ability. To address the modality imbalance in the dataset, we implement random sampling, oversampling, and modality weighting strategies, ensuring the model learns features from different modalities in a balanced manner. To improve inference speed on CPUs, we apply post-training model quantization, making the model practical for real-world deployment without compromising performance. Our model performs strongly across the evaluation metrics, and results on the validation dataset demonstrate its effectiveness and reliability in medical image segmentation tasks. In summary, our approach achieves a well-balanced trade-off between segmentation accuracy, generalization, and computational efficiency, providing a robust and efficient solution for medical image segmentation. This work not only helps improve clinical diagnostic efficiency but also offers valuable insights for future developments.
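To make the CPU-oriented post-training quantization step more concrete, the following is a minimal PyTorch sketch of dynamic int8 quantization applied to the linear layers of a ViT-style segmentation model. The model name, file paths, and the choice of quantizing only nn.Linear layers are illustrative assumptions, not the authors' released code or configuration.

```python
# Minimal sketch: post-training dynamic quantization for CPU inference.
# Assumes a trained PyTorch model whose transformer blocks (attention and
# MLP projections) are built from nn.Linear layers; paths are placeholders.
import torch
import torch.nn as nn


def quantize_for_cpu(model: nn.Module) -> nn.Module:
    """Convert linear-layer weights to int8 with dynamic activation quantization."""
    model.eval()
    return torch.quantization.quantize_dynamic(
        model,          # full float32 model
        {nn.Linear},    # layer types to quantize
        dtype=torch.qint8,
    )


if __name__ == "__main__":
    # Hypothetical usage: load a trained checkpoint, quantize, and save.
    model = torch.load("vit_tiny_sam.pth", map_location="cpu")  # placeholder path
    model_int8 = quantize_for_cpu(model)
    torch.save(model_int8.state_dict(), "vit_tiny_sam_int8.pth")
```

Dynamic quantization needs no calibration data, which makes it a common first choice for transformer-heavy models deployed on CPUs; static or quantization-aware schemes could be substituted if further speedup or accuracy recovery were required.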
Submission Number: 15