From Single-Round to Sequential: Building Stateful Interactive Medical Image Segmentation with SegVol and GRU Corrector

07 Jun 2025 (modified: 03 Nov 2025) | CVPR 2025 Workshop MedSegFM Submission | Readers: Everyone | License: CC BY 4.0
Keywords: Interactive Medical Image Segmentation; Sequential State Modeling; Uncertainty-Driven Refinement; GRU-Based Correction
TL;DR: We introduce a stateful interactive segmentation framework using SegVol and GRU Corrector, enabling efficient and precise multi-round refinement for medical imaging with fewer user interactions.
Abstract: Medical image segmentation has advanced considerably with foundation models such as the Segment Anything Model (SAM) and its medical variants, yet real-world clinical deployment remains constrained by heterogeneous imaging protocols, limited generalization across datasets, and the inefficiency of manual interaction. While recent SAM-based frameworks (e.g., SAM2, MedSAM2) introduce memory-aware mechanisms, they still rely on dense re-encoding and lack targeted correction strategies. We propose “From Single-Round to Sequential: Building Stateful Interactive Segmentation with SegVol and GRU Corrector”, a lightweight framework that reformulates interactive segmentation as a sequential refinement process guided by uncertainty and error heuristics. Specifically, we design: (1) a GRU-based temporal module that encodes interaction history and enables stateful correction, and (2) an uncertainty-driven region adaptation scheme that focuses refinement on ambiguous or mis-segmented areas, reducing redundant computation while improving correction efficiency. On validation data, the Dice coefficient improves progressively from 0.661 (single-box prompt) to 0.671 after three refinement rounds, an absolute gain of 0.010 (about 1.5% relative), with diminishing returns in later interactions. These results suggest that uncertainty-guided, memory-efficient refinement is a promising direction for practical interactive medical segmentation.
Submission Number: 14
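
The two components named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the names `TinyGRUCell`, `binary_entropy`, `uncertain_region`, and `dice`, the entropy threshold of 0.5, and the two-element per-round feature vector are all assumptions made for illustration. The sketch uses voxel-wise binary entropy as the uncertainty measure and a standard GRU update to carry a state vector across interaction rounds.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyGRUCell:
    """Standard GRU update (hypothetical stand-in for the paper's corrector)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(hidden_size)
        # Index 0: update gate, 1: reset gate, 2: candidate state.
        self.W = rng.uniform(-scale, scale, (3, hidden_size, input_size))
        self.U = rng.uniform(-scale, scale, (3, hidden_size, hidden_size))
        self.b = np.zeros((3, hidden_size))

    def step(self, x, h):
        z = sigmoid(self.W[0] @ x + self.U[0] @ h + self.b[0])
        r = sigmoid(self.W[1] @ x + self.U[1] @ h + self.b[1])
        n = np.tanh(self.W[2] @ x + r * (self.U[2] @ h) + self.b[2])
        return (1.0 - z) * n + z * h


def binary_entropy(prob, eps=1e-8):
    """Voxel-wise entropy in bits of a foreground probability map."""
    p = np.clip(prob, eps, 1.0 - eps)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))


def uncertain_region(prob, threshold=0.5):
    """Bounding box (tuple of slices) around voxels with entropy above threshold."""
    mask = binary_entropy(prob) > threshold
    if not mask.any():
        return None  # nothing ambiguous enough to refine
    coords = np.argwhere(mask)
    lo, hi = coords.min(axis=0), coords.max(axis=0) + 1
    return tuple(slice(int(a), int(b)) for a, b in zip(lo, hi))


def dice(pred, target, eps=1e-8):
    """Dice coefficient between two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)


# Toy volume: confident background with an ambiguous 2x2x2 core.
prob = np.full((4, 4, 4), 0.01)
prob[1:3, 1:3, 1:3] = 0.5
region = uncertain_region(prob)

# Carry a state vector across three simulated refinement rounds.
cell = TinyGRUCell(input_size=2, hidden_size=8)
state = np.zeros(8)
for _ in range(3):
    # Hypothetical per-round summary: mean entropy and foreground fraction.
    features = np.array([binary_entropy(prob).mean(), (prob > 0.5).mean()])
    state = cell.step(features, state)
```

In a real system, only the sub-volume `prob[region]` would be passed to the refinement step, which is how restricting attention to uncertain areas avoids re-processing the whole volume in every round.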