Optimizing Efficiency and Effectiveness in Sequential Prompt Strategy for SAM Using Reinforcement Learning

Yifei Huang, Chuyun Shen, Wenhao Li, Xiangfeng Wang, Bo Jin, Haibin Cai

Published: 01 Jan 2024, Last Modified: 01 Aug 2025. Venue: MICCAI (8) 2024. License: CC BY-SA 4.0

Abstract: In the rapidly advancing field of medical image analysis, Interactive Medical Image Segmentation (IMIS) plays a crucial role in improving diagnostic precision. Within IMIS, the Segment Anything Model (SAM), a foundation model trained on natural images, demonstrates zero-shot capabilities when applied to medical images. Nevertheless, SAM has been observed to be highly sensitive to variations in the form of interaction within an interactive sequence, which introduces substantial uncertainty into the interactive segmentation process. Consequently, identifying the optimal prompt form at each step of the sequence is essential for guiding clinicians in their use of SAM. Furthermore, deciding when to terminate an interaction requires a delicate balance between efficiency and effectiveness. To provide a sequence of optimal prompt forms and the best stopping time, we introduce an Adaptive Interaction and Early Stopping mechanism, named AIES. This mechanism models the IMIS process as a Markov Decision Process (MDP) and employs a Deep Q-network (DQN) with an adaptive penalty mechanism to optimize interaction forms and determine the optimal stopping point when using SAM. Evaluated on three public datasets, AIES identified an efficient and effective prompt strategy that significantly reduced interaction costs while achieving better segmentation accuracy than the rule-based method.
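The MDP framing in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the action names, the linearly growing step penalty (a stand-in for the "adaptive penalty mechanism", whose exact form the abstract does not give), and all function names and default values are assumptions made for illustration. It shows only three generic pieces: an action set containing prompt forms plus an explicit stop action, a reward that trades Dice gain against interaction cost, and the standard one-step Q-learning target that a DQN regresses toward.

```python
import random

# Hypothetical action set: two prompt forms plus an explicit "stop" action.
ACTIONS = ["point", "box", "stop"]


def step_reward(prev_dice, new_dice, action, step, base_penalty=0.01):
    """Reward for one interaction step (illustrative assumption).

    Dice gain minus a cost that grows linearly with the step index, so later
    interactions must improve the mask more to be worthwhile. Stopping is free.
    """
    if action == "stop":
        return 0.0
    return (new_dice - prev_dice) - base_penalty * (step + 1)


def select_action(q_values, epsilon, rng):
    """Epsilon-greedy choice over Q-value estimates (dict: action -> float)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_values[a])


def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a').

    When the episode has ended (for example after "stop"), the target is r.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values.values())


if __name__ == "__main__":
    rng = random.Random(0)
    q = {"point": 0.1, "box": 0.5, "stop": 0.2}
    action = select_action(q, epsilon=0.0, rng=rng)  # greedy choice
    r = step_reward(prev_dice=0.5, new_dice=0.6, action=action, step=0)
    print(action, r, td_target(r, q, done=False))
```

In this sketch the stop action ends the episode, so an agent would learn to stop once the expected Dice gain of a further prompt no longer exceeds its penalty; a real system would replace the dictionary of Q-values with a neural network over the image, current mask, and interaction history.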