Latent-Stability Gated SAM: Detecting Hallucinated Segmentations under Domain Shift

Published: 25 Mar 2026, Last Modified: 28 May 2026
Venue: CVPR 2026 Workshop CogVL Poster
Readers: Everyone
License: CC BY 4.0
Track: Track 1: Papers with IEEE/CVF Workshop Proceedings
Keywords: Foundation models, medical image segmentation, Segment Anything Model, reliability estimation, hallucination detection, latent stability, test-time inference, domain shift
TL;DR: A training-free stability monitoring framework that detects and corrects hallucinated SAM segmentations under domain shift.
Abstract: The Segment Anything Model (SAM) demonstrates strong zero-shot segmentation performance across diverse visual domains; however, its reliability degrades when applied outside its training distribution. In medical imaging, domain shifts and prompt inaccuracies frequently produce hallucinated segmentations: masks that appear visually plausible and carry high confidence but correspond to incorrect anatomical structures. Such silent failures present a major obstacle to deploying foundation models in safety-critical clinical environments. We introduce Latent-Stability Gated SAM (LSG-SAM), a training-free inference framework that detects and recovers from unreliable segmentation predictions through stability-based monitoring. The key idea is to evaluate the consistency of predicted masks under small perturbations applied to the latent feature representation of SAM. Reliable segmentations remain stable under these perturbations, whereas brittle predictions exhibit large variations that signal potential hallucinations. A latent-stability gate measures this stability and selectively activates a prompt-search recovery route that explores alternative prompts instead of refining a potentially incorrect mask. We evaluate LSG-SAM across three medical imaging modalities: BUSI ultrasound, JSRT chest X-ray, and Kvasir-SEG endoscopy. Results show that LSG-SAM improves segmentation robustness and reduces catastrophic hallucination failures compared to standard SAM inference. For example, on BUSI ultrasound, LSG-SAM improves Dice from 0.7746 to 0.7826, and reliability analysis shows that the stability gate activates recovery selectively for unstable predictions while preserving already-correct segmentations. These findings suggest that latent stability monitoring can act as a meta-cognitive reliability mechanism for foundation models, enabling them to evaluate and revise predictions during inference without retraining.
More broadly, this work demonstrates how stability-based monitoring can improve the robustness and trustworthiness of prompt-based vision systems under domain shift.
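The gating idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the noise model, the stability metric, the threshold, or the search strategy, so the Gaussian noise, the mean-Dice stability score, the 0.9 threshold, and the names `toy_decode`, `stability_score`, and `gated_predict` are all assumptions chosen for illustration. A toy thresholding function stands in for SAM's mask decoder, and a scalar stands in for a prompt.

```python
import random

def dice(a, b):
    """Dice overlap between two flat binary masks (lists of 0/1)."""
    inter = sum(x & y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    return 1.0 if total == 0 else 2.0 * inter / total

def toy_decode(latent, prompt):
    """Hypothetical stand-in for a mask decoder: threshold the latent at `prompt`."""
    return [1 if z > prompt else 0 for z in latent]

def stability_score(decode, latent, n_samples=8, sigma=0.05, seed=0):
    """Mean Dice between the unperturbed mask and masks from noisy latents.

    Gaussian noise and Dice are assumptions; the abstract only says
    'small perturbations' and 'consistency'.
    """
    rng = random.Random(seed)
    base = decode(latent)
    scores = []
    for _ in range(n_samples):
        noisy = [z + rng.gauss(0.0, sigma) for z in latent]
        scores.append(dice(base, decode(noisy)))
    return sum(scores) / len(scores)

def gated_predict(decode_for_prompt, latent, prompts, threshold=0.9):
    """Keep the first prompt if its mask is stable; otherwise search all prompts.

    Returns (chosen_prompt, stability, recovery_was_triggered).
    """
    def score(p):
        return stability_score(lambda z: decode_for_prompt(z, p), latent)

    first_score = score(prompts[0])
    if first_score >= threshold:
        return prompts[0], first_score, False
    # Recovery route: pick the candidate prompt with the most stable mask.
    best = max(prompts, key=score)
    return best, score(best), True

# Toy latent: half the features sit exactly on the decision boundary of prompt 0.0,
# so that prompt yields a brittle mask, while prompt 0.5 yields a stable one.
latent = [0.0] * 8 + [1.0] * 8
prompt, stability, recovered = gated_predict(toy_decode, latent, [0.0, 0.5])
print(prompt, round(stability, 3), recovered)
```

In this toy setting the gate leaves a stable first prompt untouched and only falls back to prompt search when the initial mask changes substantially under latent noise, which mirrors the selective-activation behavior the abstract reports.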
Submission Number: 21