FairSAM: Fair Classification on Corrupted Data Through Sharpness-Aware Minimization

TMLR Paper7559 Authors

18 Feb 2026 (modified: 24 Apr 2026) · Under review for TMLR · CC BY 4.0
Abstract: Image classification models trained on clean data often suffer significant performance degradation when exposed to corrupted test or deployment data, such as images with impulse, Gaussian, or environmental noise. This degradation not only lowers overall performance but also disproportionately affects demographic subgroups, raising serious algorithmic-bias concerns. Although robust learning algorithms such as Sharpness-Aware Minimization (SAM) improve overall model robustness and generalization, they do not address uneven performance degradation across demographic subgroups. Conversely, existing fairness-aware machine learning methods aim to reduce performance disparities but struggle to maintain robust and equitable accuracy across subgroups when faced with data corruption. This reveals an inherent tension between robustness and fairness on corrupted data. To address these challenges, we introduce a newly designed metric that assesses performance degradation across subgroups under data corruption, and we propose FairSAM, a framework that integrates fairness-oriented strategies into SAM to deliver equalized performance across demographic groups under corrupted conditions. Experiments on multiple real-world datasets and various predictive tasks show that FairSAM reconciles robustness and fairness, yielding a structured solution for fair and robust image classification in the presence of data corruption.
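For readers unfamiliar with the robust-training base the abstract refers to, the following is a minimal sketch of a plain SAM update step (ascend to a worst-case nearby point within a small L2 ball, then descend using the gradient taken there). It is an illustration of vanilla SAM only, not of FairSAM's fairness-oriented modifications; the function names and the toy quadratic loss are our own, not from the paper.

```python
import numpy as np

def sam_step(w, grad_fn, rho=0.05, lr=0.1):
    """One vanilla SAM update: perturb toward the sharpest nearby point,
    then descend using the gradient evaluated at the perturbed weights."""
    g = grad_fn(w)
    # Worst-case perturbation within an L2 ball of radius rho (first-order approx.).
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    g_sharp = grad_fn(w + eps)   # gradient at the perturbed point
    return w - lr * g_sharp      # sharpness-aware descent step

# Toy example: minimize L(w) = 0.5 * ||w||^2, whose gradient is simply w.
w = np.array([1.0, -2.0])
for _ in range(100):
    w = sam_step(w, lambda w: w)
print(np.linalg.norm(w) < 1e-2)  # iterates shrink toward the minimum at 0
```

In practice (e.g. in deep networks) `grad_fn` would be two backward passes through the model per step, which is SAM's main overhead relative to SGD.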
Submission Type: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: We have substantially revised the manuscript to address all concerns raised. The major changes include:

**Expanded Scope and Generalizability:**
- Added the CheXpert medical imaging dataset to demonstrate applicability beyond facial recognition tasks
- Implemented a DINOv3 backbone architecture to evaluate performance with modern foundation models
- Added experiments with asymmetric noise levels across subgroups to address realistic corruption scenarios

**Strengthened Theoretical Framework:**
- Extended our fairness metric formulation to multi-class and multi-sensitive-attribute settings
- Added formal definitions for handling intersectional demographic groups

**Enhanced Experimental Evaluation:**
- Added Group Distributionally Robust Optimization (GroupDRO) and Momentum-SAM (MSAM) as additional competitive baselines
- Provided detailed descriptions and proper citations for all baseline methods

**Improved Clarity:**
- Rewrote Sections 4.2 and 4.3 for better clarity and logical flow
- Fixed all notation inconsistencies

**Code:**
- Added a code repository link to ensure reproducibility

We believe these revisions substantially strengthen the manuscript.
Assigned Action Editor: ~Qi_CHEN6
Submission Number: 7559