Keywords: Algorithmic fairness, Synthetic data generation, Bias mitigation, Tabular classification, Adversarial debiasing, Machine learning ethics, Demographic parity, Equal opportunity
TL;DR: We develop a synthetic data framework for systematically evaluating fairness-aware classification methods; the proposed classifiers reduce demographic parity bias by 97% at an accuracy cost of 4-6%.
Abstract: Machine learning classifiers often exhibit bias against protected demographic
groups when trained on imbalanced datasets. This work presents a comprehensive
framework for investigating fairness in tabular classification using fully
synthetic data. We generate controlled synthetic datasets with configurable bias
parameters and evaluate lightweight fairness mitigation strategies including
reweighting and adversarial debiasing. Our approach enables systematic comparison
of fairness-accuracy trade-offs across multiple baseline and proposed methods.
Results show that our proposed fairness-aware classifiers reduce demographic
parity bias by 97% at an accuracy cost of 4-6%. The synthetic data framework
provides a reproducible and
privacy-preserving testbed for fairness research, enabling controlled
investigation of bias mitigation techniques without real-world data constraints.
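
The abstract does not give the generator or the metric definitions, so what follows is only an illustrative sketch of the kind of pipeline it describes, not the authors' implementation. The function names, the `bias` parameter, the two-feature logistic label model, and the particular reweighting scheme (the standard P(a)P(y)/P(a,y) weights, swapped in here as one common choice) are all assumptions made for the example.

```python
import numpy as np

def make_biased_data(n=20000, bias=0.3, seed=0):
    """Hypothetical generator: `bias` scales down the positive-label rate of group a=1."""
    rng = np.random.default_rng(seed)
    a = rng.integers(0, 2, size=n)             # binary protected attribute
    x = rng.normal(size=(n, 2))                # features drawn independently of the group
    p = 1.0 / (1.0 + np.exp(-(x[:, 0] + 0.5 * x[:, 1])))  # label probability before bias
    p = p * (1.0 - bias * a)                   # assumed bias mechanism: fewer positives for a=1
    y = (rng.random(n) < p).astype(int)
    return x, a, y

def demographic_parity_gap(pred, a):
    """|P(pred=1 | a=0) - P(pred=1 | a=1)| for binary predictions."""
    return abs(pred[a == 0].mean() - pred[a == 1].mean())

def equal_opportunity_gap(pred, y, a):
    """Absolute difference in true positive rate between the two groups."""
    tpr0 = pred[(a == 0) & (y == 1)].mean()
    tpr1 = pred[(a == 1) & (y == 1)].mean()
    return abs(tpr0 - tpr1)

def reweighting_weights(a, y):
    """Sample weights w(a, y) = P(a) * P(y) / P(a, y).

    Assumes every (group, label) cell is non-empty; an empty cell would divide by zero.
    """
    w = np.empty(len(y), dtype=float)
    for g in (0, 1):
        for c in (0, 1):
            mask = (a == g) & (y == c)
            w[mask] = (a == g).mean() * (y == c).mean() / mask.mean()
    return w

x, a, y = make_biased_data()
label_gap = demographic_parity_gap(y, a)       # bias present in the raw labels
w = reweighting_weights(a, y)                  # pass as sample weights to a classifier
```

Under these weights the weighted positive-label rate is the same in both groups by construction, which is why reweighting is a cheap baseline: a classifier trained with `w` as sample weights no longer sees a group-dependent base rate. Adversarial debiasing, the other strategy the abstract names, needs a trainable model and is not sketched here.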
Supplementary Material: zip
Submission Number: 268