Keywords: Fairness, Generative models, GAN, Calibration
Abstract: Recently, there has been increased interest in fair generative models. In this work,
we conduct, for the first time, an in-depth study on fairness measurement, a
critical component in gauging progress on fair generative models. We make three
contributions. First, we conduct a study that reveals that the existing fairness
measurement framework has considerable measurement errors, even when highly
accurate sensitive attribute (SA) classifiers are used. These findings cast doubt
on previously reported fairness improvements. Second, to address this issue,
we propose CLassifier Error-Aware Measurement (CLEAM), a new framework
which uses a statistical model to account for inaccuracies in SA classifiers. Our
proposed CLEAM reduces measurement errors significantly, e.g., 4.98%→0.62%
for StyleGAN2 w.r.t. Gender. Additionally, CLEAM achieves this with minimal
additional overhead. Third, we utilize CLEAM to measure fairness in important
text-to-image generators and GANs, revealing considerable biases in these models
that raise concerns about their applications. Code and more resources:
https://sutd-visual-computing-group.github.io/CLEAM/.
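As a rough illustration of the error-aware measurement idea described above (not the authors' CLEAM formulation, whose statistical model is given in the paper), a classic misclassification correction can recover a true class proportion from a measured one, given a binary SA classifier's sensitivity and specificity estimated on a held-out labeled set. The function name and numbers below are hypothetical.

```python
import numpy as np

def corrected_proportion(p_measured, sensitivity, specificity):
    """Rogan-Gladen-style correction of a measured class proportion.

    p_measured: fraction of generated samples the binary SA classifier
        labels as class 1.
    sensitivity / specificity: classifier accuracy on each class,
        estimated on a labeled validation set.
    Returns an estimate of the true class-1 proportion, clipped to [0, 1].
    """
    denom = sensitivity + specificity - 1.0
    if abs(denom) < 1e-8:
        raise ValueError("Classifier is uninformative (sensitivity + specificity ~ 1).")
    p_true = (p_measured + specificity - 1.0) / denom
    return float(np.clip(p_true, 0.0, 1.0))

# Hypothetical example: a Gender classifier with 95% sensitivity and 97%
# specificity labels 60% of generated faces as class 1; the naive estimate
# (0.60) is corrected for the classifier's errors (~0.62 here).
print(corrected_proportion(0.60, sensitivity=0.95, specificity=0.97))
```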
Supplementary Material: pdf
Submission Number: 959