On Tackling Domain Shift in Breast MRI Using Only Publicly-Available Data: Reproducible Breast Cancer Segmentation and pCR Prediction
Abstract: Breast magnetic resonance imaging (MRI) offers high diagnostic capability for breast cancer, but comparison of machine learning (ML) methods has historically been limited by the lack of annotated public data. The MAMA-MIA challenge addresses this gap by providing voxel-level tumor segmentations for n = 1506 pre-treatment dynamic contrast-enhanced (DCE) MRI cases. In this work, we present publicly reproducible ML models for two breast cancer tasks: tumor segmentation and prediction of pathological complete response (pCR) following neoadjuvant chemotherapy. For segmentation, we combine self-supervised pretraining on n = 4799 public DCE-MRI volumes with a residual encoder architecture, achieving performance substantially above the baseline and strong fairness across demographic subgroups. For pCR classification, we feed radiomics features extracted from the segmented lesions to a calibrated gradient boosting classifier, emphasizing fairness and domain generalization. Our results demonstrate that high-performing and fair breast MRI analysis can be achieved using only public data, showing the potential for equitable and transparent ML models in oncology.
External IDs: dblp:conf/miccai/KacheleBEM25