Keywords: fairness, algorithmic bias, ethnic bias, MRI, breast cancer, segmentation, medical imaging, label bias, MAMA-MIA
TL;DR: This fairness audit of a breast cancer segmentation dataset reveals intrinsic bias against younger patients and certain ethnic subgroups and demonstrates how combining data from multiple hospitals can hide severe, site-specific disparities.
Abstract: Deep learning models aim to improve diagnostic workflows, but fairness evaluation remains underexplored beyond classification, e.g., in image segmentation. Unaddressed segmentation bias can lead to disparities in the quality of care for certain populations, potentially compounded across clinical decision points and amplified through iterative model development. Here, we audit the fairness of the automated segmentation labels provided in the breast cancer tumor segmentation dataset MAMA-MIA. We evaluate automated segmentation quality across age, ethnicity, and data source. Our analysis reveals an intrinsic age-related bias against younger patients that persists even after controlling for confounding factors, such as data source. We hypothesize that this bias may be linked to physiological factors that pose a known challenge for both radiologists and automated systems. Finally, we show how aggregating data from multiple sources can mask site-specific ethnic biases, underscoring the necessity of investigating data at a granular level.
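A minimal sketch of the kind of subgroup audit the abstract describes: score each case's automated segmentation against a reference mask, then compare quality across age, ethnicity, and data-source groups. The `dice_score` helper, the subgroup labels, and the toy `records` table are illustrative assumptions, not the authors' evaluation code or the MAMA-MIA data.

```python
import numpy as np
import pandas as pd

def dice_score(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    """Dice similarity coefficient between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    return 2.0 * intersection / (pred.sum() + gt.sum() + eps)

# Toy example of scoring one case's automated label against a reference.
pred = np.zeros((4, 4), dtype=bool); pred[:2, :2] = True
gt = np.zeros((4, 4), dtype=bool); gt[:3, :2] = True
print(f"example case Dice: {dice_score(pred, gt):.3f}")  # 0.800

# Hypothetical per-case records: each row pairs a case's Dice score
# with its subgroup attributes (age group, ethnicity, data source).
records = pd.DataFrame({
    "dice":      [0.91, 0.62, 0.88, 0.70],
    "age_group": ["50+", "<40", "50+", "<40"],
    "ethnicity": ["A", "B", "A", "B"],
    "source":    ["site1", "site1", "site2", "site2"],
})

# Aggregate segmentation quality per subgroup; large gaps between
# group means flag potential bias.
for attr in ["age_group", "ethnicity", ["source", "ethnicity"]]:
    print(records.groupby(attr)["dice"].agg(["mean", "count"]), "\n")
```

Grouping jointly by source and ethnicity, rather than pooling all sites into a single ethnicity breakdown, is what surfaces the site-specific disparities that aggregation can otherwise hide.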
Submission Number: 23