GAMT: A Geometry-Aware, Multi-view, Training-free Segmentation Framework for Foundation Models in Medical Imaging

07 Jun 2025 (modified: 09 Jun 2025) · CVPR 2025 Workshop MedSegFM Submission · CC BY 4.0
Keywords: Medical image segmentation, foundation model, training-free.
TL;DR: A Geometry-Aware, Multi-view, Training-free Segmentation Framework for Foundation Models in Medical Imaging
Abstract: Medical image segmentation is a critical task in clinical diagnostics and biomedical research. While deep learning has significantly advanced the field, most existing methods rely on task-specific models that require extensive manual annotations for training or adaptation. Vision foundation models, such as the Segment Anything Model (SAM), offer a promising alternative with their universal segmentation capabilities. However, their application to 3D medical imaging remains limited, especially in zero-shot scenarios involving previously unseen anatomical structures. In this work, we introduce GAMT, a zero-shot, training-free framework that repurposes powerful 2D foundation segmentation models (e.g., SAM, SAM-Med2D) for universal 3D biomedical image segmentation. To bridge the dimensionality gap, GAMT performs slice-wise inference along three orthogonal anatomical planes (axial, coronal, and sagittal) and subsequently fuses the predictions to construct a coherent 3D segmentation mask. Crucially, this framework achieves average Dice Similarity Coefficient (DSC) and Normalized Surface Dice (NSD) scores of dsc-score and nsd-scores, respectively, without requiring any model training or fine-tuning. Our code and results are publicly available at https://github.com/SpatialAILab.
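The abstract's core mechanism, slice-wise 2D inference along three orthogonal planes followed by fusion into a 3D mask, can be sketched in a few lines. The sketch below is illustrative, not the authors' implementation: segment_slice_fn stands in for any 2D foundation segmenter (e.g., a SAM or SAM-Med2D wrapper returning a binary mask), and simple majority voting is assumed as the fusion rule, which the abstract does not specify.

import numpy as np

def segment_multiview(volume, segment_slice_fn, threshold=2):
    """Hypothetical sketch of multi-view slice-wise segmentation and fusion.

    volume: 3D numpy array of shape (D, H, W).
    segment_slice_fn: any 2D segmenter mapping a slice to a same-shape
        binary mask (assumed here; the paper's prompting strategy may differ).
    threshold: number of views (out of 3) that must agree for a voxel
        to be kept, i.e., majority voting as an assumed fusion rule.
    """
    votes = np.zeros(volume.shape, dtype=np.uint8)

    # Axial view: segment each slice along axis 0.
    for i in range(volume.shape[0]):
        votes[i, :, :] += segment_slice_fn(volume[i, :, :]).astype(np.uint8)

    # Coronal view: segment each slice along axis 1.
    for j in range(volume.shape[1]):
        votes[:, j, :] += segment_slice_fn(volume[:, j, :]).astype(np.uint8)

    # Sagittal view: segment each slice along axis 2.
    for k in range(volume.shape[2]):
        votes[:, :, k] += segment_slice_fn(volume[:, :, k]).astype(np.uint8)

    # Fuse: keep voxels predicted as foreground in at least `threshold` views.
    return votes >= threshold

A full implementation would additionally handle per-slice prompting of the foundation model and could weight views or smooth predictions across adjacent slices; none of that detail is given in the abstract, so it is omitted here.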
Submission Number: 15