Keywords: Lung Nodule Detection, Transformer-based Segmentation, Deformable DETR, Focal Loss, Maximum Intensity Projection, Class Imbalance, SAM Fine-Tuning, Sparse CT Imaging, Anomaly Detection, Medical Imaging.
TL;DR: We leverage a two-stage transformer framework that reduces dependency on extensive annotated data by integrating deformable attention and self-supervised segmentation refinement, ensuring robust lung nodule detection in sparse CT scans.
Abstract: Accurate segmentation of lung nodules in computed tomography (CT) scans is challenging due to extreme class imbalance: nodules appear sparsely among healthy tissue. We introduce a novel two-stage approach for lung nodule segmentation that frames the task as an anomaly detection problem. Stage 1 employs a custom Detection Transformer architecture with deformable attention and focal loss to generate region proposals, addressing class imbalance and localizing sparse nodules. Stage 2 refines the predicted bounding boxes into segmentation masks using a fine-tuned variant of the Segment Anything Model (SAM). To address sparsity and enhance spatial context, a 5 mm Maximum Intensity Projection is applied to the input, which improves differentiation between nodules, bronchioles, and vascular structures. On the LUNA16 dataset, where only 5% of slices contain a nodule, Stage 1 achieves an F1 score of 94.2%, with 95.2% sensitivity and 93.3% precision, and Stage 2 achieves a Dice coefficient of 91.4%, outperforming existing state-of-the-art methods. The model was additionally validated on a private test dataset of 30 patients with significantly different characteristics, where it achieved a Dice coefficient of 78.3% despite the distribution shift. These results demonstrate strong generalization to clinical variability and establish our approach as the new state of the art for lung nodule segmentation.
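The building blocks named in the abstract (focal loss, a slab Maximum Intensity Projection, the Dice coefficient, and the F1 score) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the function names, the `alpha` and `gamma` defaults, the sliding-window form of the projection, and the choice of 5 slices per slab (which would correspond to 5 mm only at an assumed 1 mm slice spacing) are all assumptions made for illustration.

```python
import math


def focal_loss(probs, targets, alpha=0.25, gamma=2.0):
    """Mean binary focal loss over predicted probabilities and 0/1 targets.

    alpha and gamma are illustrative defaults, not values from the paper.
    """
    eps = 1e-7
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)            # avoid log(0)
        p_t = p if y == 1 else 1.0 - p             # probability of the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t)^gamma down-weights easy, well-classified examples
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


def slab_mip(volume, n_slices=5):
    """Sliding maximum intensity projection along the slice axis.

    volume is a list of 2D slices (each a list of rows). Output slice i is
    the per-pixel maximum over a window of up to n_slices centred on i.
    """
    half = n_slices // 2
    out = []
    for i in range(len(volume)):
        window = volume[max(0, i - half): i + half + 1]
        rows, cols = len(window[0]), len(window[0][0])
        out.append([[max(s[r][c] for s in window) for c in range(cols)]
                    for r in range(rows)])
    return out


def dice(pred, truth):
    """Dice coefficient between two flat binary masks (lists of 0/1)."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    denom = sum(pred) + sum(truth)
    return 1.0 if denom == 0 else 2.0 * inter / denom


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (sensitivity)."""
    return 2.0 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # The reported precision and sensitivity are consistent with the reported F1.
    print(round(f1_score(0.933, 0.952), 3))
```

As a consistency check, plugging the reported 93.3% precision and 95.2% sensitivity into `f1_score` gives a value that rounds to the reported 94.2%. A real pipeline would typically use an array library for the projection and loss rather than nested lists.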
Submission Type: Original Work
Submission Number: 2