BioSAM-3: Segment Anything in Medical Images with Semantic Concepts
Keywords: Medical Image Segmentation, Medical Imaging, Promptable Segmentation
TL;DR: We propose BioSAM-3, a text-promptable model for medical image and video segmentation.
Abstract: Medical image segmentation serves as a cornerstone of biomedical research and clinical analysis. However, existing methods often fail to generalize across institutions or modalities, requiring extensive adaptation for every new deployment. In this work, we propose BioSAM-3, a text-promptable model for medical image and video segmentation. By adapting the Segment Anything Model (SAM) 3 architecture to multi-modal medical data paired with semantic concept labels, BioSAM-3 enables robust zero-shot and prompt-driven segmentation across diverse organs, modalities, and clinical settings. Moreover, we integrate BioSAM-3 into an agentic framework with Large Vision-Language Models (LVLMs), enabling automatic prompt generation and iterative refinement for more accurate and reliable clinical segmentation. Extensive experiments demonstrate that BioSAM-3 delivers superior generalizability and strong cross-domain performance, highlighting its potential as a universal segmentation interface for real-world clinical workflows.
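The agentic loop described in the abstract (an LVLM generates a text prompt, the segmenter produces a mask, and the result is iteratively refined) can be sketched in outline. This is a minimal illustration with stub functions standing in for the LVLM and BioSAM-3 inference; all names, prompts, and confidence scores below are hypothetical, not the authors' actual API.

```python
# Illustrative sketch of LVLM-driven prompt generation with iterative
# refinement. propose_prompt / segment / critique are toy stubs, not the
# real BioSAM-3 or LVLM interfaces.

def propose_prompt(image, feedback=None):
    # Stub LVLM prompt generator: incorporates critique feedback if given.
    return "left kidney" if feedback is None else f"left kidney, {feedback}"

def segment(image, text_prompt):
    # Stub BioSAM-3 call: returns a (mask, confidence) pair. In this toy
    # model, more specific prompts (more clauses) yield higher confidence.
    confidence = min(0.6 + 0.2 * text_prompt.count(","), 1.0)
    return {"prompt": text_prompt}, confidence

def critique(mask, confidence, threshold=0.75):
    # Stub LVLM critic: accept the mask, or return refinement feedback.
    return None if confidence >= threshold else "exclude renal pelvis"

def agentic_segment(image, max_rounds=3):
    # Propose -> segment -> critique, looping until the critic accepts
    # or the round budget is exhausted.
    feedback = None
    for _ in range(max_rounds):
        prompt = propose_prompt(image, feedback)
        mask, conf = segment(image, prompt)
        feedback = critique(mask, conf)
        if feedback is None:
            break
    return mask, conf

mask, conf = agentic_segment(image="ct_slice_042")
print(mask["prompt"], round(conf, 2))  # refined prompt after one critique round
```

In this sketch the first prompt is rejected by the critic, the feedback is folded into a more specific second prompt, and the refined mask is accepted, mirroring the refinement behavior the abstract attributes to the LVLM agent.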
Primary Subject Area: Segmentation
Secondary Subject Area: Foundation Models
Registration Requirement: Yes
Visa & Travel: Yes
Read CFP & Author Instructions: Yes
Originality Policy: Yes
Single-blind & Not Under Review Elsewhere: Yes
LLM Policy: Yes
Submission Number: 348