Comprehensive Evaluation of Quantitative Measurements From Automated Deep Segmentations of PSMA PET/CT Images

Obed Korshie Dzikunu, Amirhossein Toosi, Shadab Ahamed, Sara Harsini, François Bénard, Xiaoxiao Li, Arman Rahmim

Published: 01 Jan 2025, Last Modified: 06 Nov 2025, IEEE Transactions on Radiation and Plasma Medical Sciences, CC BY-SA 4.0
Abstract: This study comprehensively evaluates quantitative measurements extracted from automated deep-learning-based segmentations, going beyond traditional Dice Similarity Coefficient (DSC) assessments. It focuses on six quantitative metrics: SUVmax, SUVmean, total lesion activity (TLA), total molecular tumor volume (TMTV), lesion count, and lesion spread. We analyzed 380 prostate-specific membrane antigen (PSMA) targeted [18F]DCFPyL PET/CT scans of patients with biochemical recurrence of prostate cancer. We trained three deep neural networks (U-Net, Attention U-Net, and SegResNet) with four loss functions: Dice Loss, Dice Cross Entropy, Dice Focal Loss, and our proposed L1-weighted Dice Focal Loss (L1DFL). Attention U-Net paired with L1DFL achieved the strongest agreement with the ground truth (Lin’s concordance correlation = 0.90–0.99 for SUVmax and TLA), whereas models trained with the Dice Loss and the other two compound losses underperformed, particularly with SegResNet. Equivalence testing (TOST, α = 0.05, equivalence margin = 20%) confirmed high performance for the SUV metrics, lesion count, and TLA, with L1DFL yielding the best performance. By contrast, tumor volume and lesion spread exhibited greater variability. Bland-Altman, Coverage Probability, and Total Deviation Index analyses further showed that the proposed L1DFL minimizes variability in quantifying the ground-truth clinical measures. However, differences in DSC across the loss functions were not statistically significant. The code is publicly available at: https://github.com/ObedDzik/pca_segment.git.
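To make the agreement analysis concrete, the following is a minimal sketch of how Lin's concordance correlation and a few of the per-scan measures could be computed from an SUV volume and a binary segmentation mask. It is not taken from the authors' repository. The function names `lins_ccc` and `lesion_metrics`, the `voxel_volume_ml` parameter, and the definition of TLA as SUVmean multiplied by segmented volume are assumptions made for illustration.

```python
import numpy as np


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient between two 1-D samples.

    Uses population (ddof=0) variances and covariance, as in Lin's
    original definition.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mean_x, mean_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mean_x) * (y - mean_y)).mean()
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


def lesion_metrics(suv, mask, voxel_volume_ml):
    """Hypothetical helper: SUVmax, SUVmean, volume, and TLA for one scan.

    Assumes `suv` holds SUV values, `mask` is a binary segmentation of the
    same shape, and TLA = SUVmean * segmented volume (an assumption here).
    """
    values = np.asarray(suv, dtype=float)[np.asarray(mask).astype(bool)]
    if values.size == 0:
        # No segmented voxels: report zeros rather than failing.
        return {"SUVmax": 0.0, "SUVmean": 0.0, "TMTV": 0.0, "TLA": 0.0}
    volume_ml = values.size * voxel_volume_ml
    suv_mean = float(values.mean())
    return {
        "SUVmax": float(values.max()),
        "SUVmean": suv_mean,
        "TMTV": volume_ml,
        "TLA": suv_mean * volume_ml,
    }
```

In this sketch, one would call `lesion_metrics` once with the predicted mask and once with the ground-truth mask for every scan, then pass the two resulting lists of, say, SUVmax values to `lins_ccc`. Unlike the Pearson correlation, the concordance coefficient penalizes a systematic offset between prediction and ground truth through the squared mean-difference term in the denominator.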