Scaling Supervision for Free: Leveraging Universal Segmentation Models for Enhanced Medical Image Diagnosis
Keywords: medical image analysis, automated annotation, supervision scaling
Abstract: Deep learning-based medical image analysis has been constrained by the limited availability of large-scale annotated data. While recent advances in large language models have enabled scaling the automatic extraction of diagnostic labels from reports, we propose that scaling other forms of supervision is an equally important yet unexplored direction. Inspired by the success of foundation models, we leverage a modern universal segmentation model to scale anatomical segmentation as an additional supervision signal during training.
Through extensive experiments on three large-scale CT datasets totaling 58K+ volumes, we demonstrate that incorporating this ``free'' anatomical supervision consistently improves the performance of mainstream architectures (ResNet, ViT, and Swin Transformer) by up to 12.74\%, with particularly notable gains for Transformer-based models and anatomically localized abnormalities. Inference efficiency is preserved because the segmentation branch is used only during training. This work opens a new direction for scaling supervision in medical imaging and demonstrates how existing universal segmentation models can be repurposed to enhance diagnostic models at virtually no additional cost.
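To make the training setup concrete, below is a minimal sketch (not the authors' code) of a diagnosis backbone with an auxiliary segmentation head supervised by anatomical pseudo-masks produced offline by a universal segmentation model; the segmentation branch is dropped at inference, so it adds no test-time cost. The toy encoder, loss choices, and the 0.5 auxiliary weight are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DiagnosisWithAuxSeg(nn.Module):
    """Classifier with a train-time-only auxiliary segmentation head."""

    def __init__(self, num_classes: int, num_organs: int):
        super().__init__()
        # Toy 3D encoder standing in for ResNet/ViT/Swin backbones.
        self.encoder = nn.Sequential(
            nn.Conv3d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Linear(32, num_classes)
        # Auxiliary segmentation head: used only during training.
        self.seg_head = nn.Conv3d(32, num_organs, kernel_size=1)

    def forward(self, volume, return_seg: bool = False):
        feats = self.encoder(volume)                        # (B, 32, D', H', W')
        logits = self.classifier(feats.mean(dim=(2, 3, 4)))  # global pooling
        if return_seg:
            return logits, self.seg_head(feats)
        return logits  # inference path: classification only


def train_step(model, volume, labels, pseudo_masks, optimizer, seg_weight=0.5):
    """One step mixing diagnostic labels with 'free' anatomical pseudo-masks.

    `labels` are multi-hot floats (B, num_classes); `pseudo_masks` are integer
    organ maps (B, 1, D, H, W) generated offline by a universal segmentation model.
    """
    logits, seg_logits = model(volume, return_seg=True)
    # Downsample pseudo-masks to the auxiliary head's feature resolution.
    masks = F.interpolate(pseudo_masks.float(), size=seg_logits.shape[2:],
                          mode="nearest").long().squeeze(1)
    loss = F.binary_cross_entropy_with_logits(logits, labels) \
        + seg_weight * F.cross_entropy(seg_logits, masks)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

At deployment, only `model(volume)` is called, so the extra supervision costs nothing at inference time.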
Primary Subject Area: Detection and Diagnosis
Secondary Subject Area: Application: Radiology
Paper Type: Methodological Development
Registration Requirement: Yes
Visa & Travel: Yes
Submission Number: 165