MedficientSAM: A Robust Medical Segmentation Model with Optimized Inference Pipeline for Limited Clinical Settings

Bao-Hiep Le; Dang-Khoa Nguyen-Vu; Trong-Hieu Nguyen-Mau; Hai-Dang Nguyen; Minh-Triet Tran

Published: 11 Oct 2024, Last Modified: 11 Oct 2024, CVPR24 MedSAMonLaptop, CC BY-SA 4.0

Keywords: Medical image segmentation, Distillation, Embeddings Caching, C++ Implementation, Edge AI

TL;DR: A Robust Medical Segmentation Model with Optimized Inference Pipeline for Limited Clinical Settings

Abstract: Medical image segmentation plays a crucial role in clinical practice, aiding in identifying tumors, delineating organs, and monitoring disease progression. The advent of the Segment Anything Model (SAM) has enabled the development of universal medical image segmentation models that generalize across different modalities. However, the accessibility of such deep learning models in clinical settings is still limited by the reliance on powerful computing devices. In this paper, we propose MedficientSAM, which adopts the EfficientViT model to replace the heavy image encoder in SAM and then distills the knowledge from the MedSAM model on the challenge’s training set. To further improve inference time, we re-implement the inference pipeline in the C++ programming language, optimizing the runtime on edge devices. MedficientSAM outperforms MedSAM in both accuracy and efficiency, achieving average DSC and NSD scores of 0.8642 and 0.8795, respectively, on the public validation set. The average inference time is 1.0083 seconds for 2D images and 8.9585 seconds for 3D images. Our code and models are publicly available at https://github.com/hieplpvip/medficientsam.
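For readers unfamiliar with the DSC figure quoted in the abstract, the Dice similarity coefficient between a predicted mask A and a ground-truth mask B is 2|A ∩ B| / (|A| + |B|). The sketch below is a minimal illustration of that definition only, not the challenge's evaluation code. The function name `dice_coefficient`, the flat-list mask representation, and the convention of scoring two empty masks as 1.0 are assumptions made for this example.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient for two binary masks.

    Both masks are flat sequences of 0/1 values of equal length
    (an assumed representation chosen to keep the example dependency-free).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")

    # |A ∩ B|: pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # |A| + |B|: foreground pixel counts of each mask.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# A 2x4 mask flattened row by row: 3 overlapping pixels, 4 foreground pixels each.
pred = [0, 1, 1, 0, 0, 1, 1, 0]
target = [0, 1, 1, 0, 0, 0, 1, 1]
print(dice_coefficient(pred, target))  # 2 * 3 / (4 + 4) = 0.75
```

A score of 1.0 means the masks coincide exactly and 0.0 means they share no foreground pixels, so the reported average of 0.8642 indicates substantial but imperfect overlap with the reference annotations.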

Submission Number: 16
