DAFT: Data-Aware Fine-Tuning of Foundation Models for Efficient and Effective Medical Image Segmentation

Published: 11 Oct 2024 (Last Modified: 11 Oct 2024), CVPR24 MedSAMonLaptop, CC BY-SA 4.0
Keywords: Data Aware, Fine Tuning, Efficient, Image Segmentation
TL;DR: We use a novel fine-tuning technique and several improvements to the inference environment to perform efficient and effective medical image segmentation.
Abstract: Efficient and effective medical image segmentation supports faster and better decision-making by medical experts. In this work, we propose data-aware fine-tuning (DAFT), a method for enabling efficient and effective inference with foundation models, and apply it to medical image segmentation tasks. Following concepts from meta-learning for algorithm selection and dynamic selection, DAFT fine-tunes several versions of a foundation model on subsets of all available data instead of fine-tuning just one larger model. At inference time, we select which fine-tuned model to use for the prediction depending on the distribution of the input data. DAFT enables us to create a more efficient and effective model for each subset than a single model trained on all data. In our implementation of DAFT for the "Segment Anything In Medical Images On Laptop" competition, part of the CVPR24 Workshop on "Foundation Models for Medical Vision", we use the EfficientViT architecture, knowledge distillation, and the OpenVINO runtime to further improve inference. Additionally, we optimized the efficiency of our method through a series of improvements, including an optimized inference runtime, caching, a slimmer Docker deployment container, and better inference code. Compared to the baseline on the test data, DAFT improved the average Dice similarity coefficient from 78.64% to 83.29% and the normalized surface distance from 80.58% to 85.59%. Our final submission secured first place on the post-challenge leaderboard. Finally, and more importantly, we improved the average inference speed over the baseline by a factor of 6.5 (from 14.69 to 2.25 seconds) on the test set.
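The core idea of DAFT, selecting one of several subset-specific fine-tuned models based on the distribution of the input, can be illustrated with a minimal sketch. The signature function, the expert registry, and the toy modality names below are illustrative assumptions, not the authors' actual implementation:

```python
# Minimal sketch of DAFT-style model selection at inference time.
# All names (image_signature, experts, modality labels) are hypothetical;
# the real system routes inputs to fine-tuned EfficientViT models.

def image_signature(image):
    """Crude descriptor of the input's pixel distribution: (mean, variance)."""
    n = len(image)
    mean = sum(image) / n
    var = sum((p - mean) ** 2 for p in image) / n
    return (mean, var)

def select_model(image, experts):
    """Pick the expert whose training-subset signature is closest
    (squared Euclidean distance) to the input image's signature."""
    sig = image_signature(image)

    def dist(entry):
        (m, v), _ = entry
        return (sig[0] - m) ** 2 + (sig[1] - v) ** 2

    _, model = min(experts, key=dist)
    return model

# Toy registry: (subset signature, predictor) pairs, e.g. one per modality.
experts = [
    ((0.2, 0.01), lambda img: "ct_expert"),
    ((0.7, 0.05), lambda img: "ultrasound_expert"),
]

bright_image = [0.68, 0.70, 0.75, 0.66]  # resembles the second subset
model = select_model(bright_image, experts)
print(model(bright_image))  # → ultrasound_expert
```

In practice the routing signal would be a richer descriptor (e.g. modality metadata or learned features) rather than raw pixel statistics, but the selection step itself stays this cheap, which is what keeps per-input inference fast.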
Submission Number: 5