Evaluating the Impact of Medical Image Reconstruction on Downstream AI Fairness and Performance
Keywords: Fairness, Image Reconstruction, GANs, Diffusion Models
Abstract: Reconstruction models designed to improve image quality from noisy or undersampled data, such as low-dose X-rays or accelerated MRI scans, are increasingly deployed in clinical workflows. However, these models are typically evaluated using pixel-level metrics like PSNR, leaving their impact on downstream diagnostic performance and fairness unclear. To address this gap, we introduce a scalable evaluation framework that simulates a realistic clinical workflow by chaining reconstruction with segmentation and classification models. Using publicly available MRI and X-ray datasets, we demonstrate that conventional reconstruction metrics poorly track downstream performance: diagnostic accuracy remains largely stable even as reconstruction PSNR declines with increasing image noise. In contrast, fairness metrics exhibit greater variability, with reconstruction sometimes amplifying demographic biases, particularly with respect to patient sex. However, the magnitude of this additional bias is modest compared to the inherent biases already present in diagnostic models. To explore potential bias mitigation, we adapt three established bias-reduction strategies from classification to the reconstruction setting, but observe limited efficacy. Overall, our findings emphasize the importance of holistic performance and bias assessments throughout the entire medical imaging workflow, providing insights toward developing fairer and more effective AI systems in healthcare.
Primary Subject Area: Image Acquisition and Reconstruction
Secondary Subject Area: Fairness and Bias
Registration Requirement: Yes
Read CFP & Author Instructions: Yes
Originality Policy: Yes
Single-blind & Not Under Review Elsewhere: Yes
LLM Policy: Yes
Submission Number: 37