Keywords: Fairness in Machine Learning, Equitable Deep Learning, Fairness Error Bound
TL;DR: This work theoretically analyzes the impact of disease prevalence and data distribution on fairness in medical AI, deriving bounds and providing empirical validation on diverse datasets to understand and mitigate unfairness.
Abstract: Fairness in machine learning is paramount to human society because machine learning systems increasingly influence various aspects of our daily lives, particularly in consequence-critical tasks such as medical diagnosis. Deep learning models for medical diagnosis often exhibit biased performance across diverse demographic groups. Theoretical analyses to understand unfairness in AI-based medical diagnosis systems are still lacking. This work presents a comprehensive theoretical analysis of the impact of disease prevalence and data distributions on the fairness guarantees of deep learning models for medical diagnosis. We formalize the fairness problem, introduce assumptions, and derive fairness error bounds, algorithmic complexity, generalization bounds, convergence rates, and group-specific risk bounds. Our analysis reveals that fairness guarantees are significantly influenced by the differences in disease prevalence rates and data distributions across demographic groups. We prove that considering fairness criteria can lead to better performance than standard supervised learning. Empirical results on diverse datasets, including FairVision, CheXpert, HAM10000 and FairFace, corroborate our theoretical findings, demonstrating the impact of disease prevalence and feature distribution disparities on the equitable performance of deep learning models for tasks such as glaucoma, diabetic retinopathy, age-related macular degeneration, and pleural effusion detection. The code for analysis is publicly available via \url{https://github.com/anonymous2research/fairness_guarantees}.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2870
Loading