\section{Introduction}
AI-based image reconstruction is an increasingly integral component of clinical workflows. These approaches are designed to enhance the quality of noisy medical images, such as low-dose X-rays or MRI scans acquired with accelerated sampling, and generate new medical images by imputing patterns learned from the training datasets \cite{AHISHAKIYE2021118}. Notably, more than 80 FDA-cleared devices are now based on this approach \cite{Singh2025.03.13.25323924}, and the images they generate are ultimately interpreted by clinicians.

Traditionally, reconstruction model performance has been evaluated using pixel-level image metrics such as peak signal-to-noise ratio (PSNR). However, these metrics provide an incomplete picture, as they do not reflect the impact of reconstructed images on subsequent clinical tasks. This gap raises a key unresolved question: \textit{How does AI-based reconstruction influence downstream clinical performance and, in particular, fairness?} Fairness is especially important to assess given the risk that generative models encode biases \cite{saumure2025humor, ruggeri-nozza-2023-multi, luccioni2023stablebiasanalyzingsocietal, mehrabi2022surveybiasfairnessmachine}. While some smaller-scale studies have involved clinician review of AI-reconstructed images \citep{Feuerriegel2023-rt, Lee2024-ic}, this approach does not scale, especially when investigating nuanced performance differences across subgroups.

In this work, we assess the downstream implications of AI-based reconstruction through an evaluation framework that applies reconstruction models in tandem with downstream classification or segmentation models. The framework provides a scalable way to understand how reconstruction errors propagate, while also simulating a realistic clinical scenario, as both reconstruction and diagnostic models are increasingly deployed in medical workflows. We apply this framework across three reconstruction approaches (U-Net, GAN, diffusion), two imaging domains (MRI, X-ray), and two tasks (classification and segmentation). We additionally propose and evaluate bias mitigation techniques tailored to reconstruction models. Our findings highlight differences in trends between image metrics and diagnostic accuracy, as well as the potential for reconstruction models to shift demographic biases.