Contrastive Learning for Fair Representations

Anonymous

17 Dec 2021 (modified: 05 May 2023) · ACL ARR 2021 December Blind Submission · Readers: Everyone
Abstract: Trained classification models can unintentionally produce biased representations and predictions, reinforcing societal preconceptions and stereotypes. Existing debiasing methods for classification models, such as adversarial training, are often expensive to train and fragile to optimise. Here, we propose a method for mitigating bias in classifier training by incorporating contrastive learning, in which instances sharing the same class label are encouraged to have similar representations, while instances sharing a protected attribute are pushed further apart. In this way, our method learns representations that capture the task label in focused regions while spreading the protected attribute diffusely, limiting its impact on prediction and thereby yielding fairer models. Extensive experimental results on three tasks show that our method achieves fairer representations and larger bias reduction than competitive baselines; that it does so without sacrificing main-task performance; and that it generalises across modalities and across binary and multi-class classification tasks, while being conceptually simple, agnostic to network architecture, and incurring minimal additional compute cost.
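The objective described in the abstract — attracting instances that share a task label while repelling instances that share a protected attribute — can be sketched as a pair of contrastive terms over pairwise representation similarities. The sketch below is an illustrative NumPy implementation under assumptions of our own (the function name `fair_contrastive_loss`, the temperature value, and the equal weighting of the two terms are hypothetical and not taken from the paper):

```python
import numpy as np

def fair_contrastive_loss(z, y, a, temp=0.5):
    """Illustrative sketch (not the paper's exact objective).

    z: (n, d) array of representations
    y: (n,) array of task labels        -> same-label pairs are attracted
    a: (n,) array of protected attrs    -> same-attribute pairs are repelled
    """
    # Unit-normalise representations so similarities are cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = (z @ z.T) / temp                    # (n, n) scaled similarities
    n = len(y)
    mask = ~np.eye(n, dtype=bool)             # exclude self-pairs

    # Row-wise log-softmax over all other instances.
    exp = np.exp(sim) * mask
    log_prob = sim - np.log(exp.sum(axis=1, keepdims=True))

    same_y = (y[:, None] == y[None, :]) & mask
    same_a = (a[:, None] == a[None, :]) & mask

    # Attract: maximise log-probability of same-class pairs.
    attract = -(log_prob * same_y).sum() / max(same_y.sum(), 1)
    # Repel: penalise high similarity between same-protected-attribute pairs.
    repel = (log_prob * same_a).sum() / max(same_a.sum(), 1)
    return attract + repel
```

The attraction term follows the usual supervised-contrastive pattern; the repulsion term simply reverses its sign over pairs that share the protected attribute, so minimising the total loss pushes those pairs apart and spreads the attribute across the representation space.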
Paper Type: long
Consent To Share Data: yes