FineDeb: A Debiased Finetuning Approach for Language Models

Anonymous

05 Jun 2022 (modified: 05 May 2023) · ACL ARR 2022 June Blind Submission
Abstract: As language models are increasingly included in human-facing machine learning tools, bias against demographic subgroups has gained attention. We consider the problem of debiasing in language models. Rather than modifying a model's already learned representations, we focus on modifying them during model training itself. We propose a two-phase methodology (FineDeb) that starts with contextual debiasing of the embeddings learned by the language model during training, then finetunes the model on the original language modelling objective. We apply our method to debias for demographics with multiple classes, demonstrating its effectiveness through extensive experiments and comparisons with state-of-the-art techniques on three metrics.
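The abstract only sketches the two-phase structure, so the snippet below is a minimal, hypothetical PyTorch illustration of a "debias, then finetune" training loop: phase 1 trains with the language-modelling loss plus a debiasing regularizer on the embeddings, phase 2 finetunes on the language-modelling objective alone. The specific debiasing loss (pulling embeddings of demographic terms toward their group centroid), the toy bigram-style model, the `demographic_groups` token ids, and all hyperparameters are assumptions for illustration, not the paper's actual objective or architecture.

```python
# Hypothetical sketch of a two-phase "debias, then finetune" loop (not the paper's code).
import torch
import torch.nn as nn

torch.manual_seed(0)

VOCAB, DIM = 1000, 64
model = nn.ModuleDict({
    "embed": nn.Embedding(VOCAB, DIM),
    "lm_head": nn.Linear(DIM, VOCAB),
})

# Hypothetical demographic term groups (token ids), e.g. gendered word sets.
demographic_groups = [torch.tensor([10, 11]), torch.tensor([42, 43, 44])]

def debias_loss():
    """Assumed regularizer: penalize each group member's distance from the group centroid."""
    loss = 0.0
    for ids in demographic_groups:
        vecs = model["embed"](ids)                 # (group_size, DIM)
        centroid = vecs.mean(dim=0, keepdim=True)  # (1, DIM)
        loss = loss + ((vecs - centroid) ** 2).sum(dim=1).mean()
    return loss

def lm_loss(tokens):
    """Next-token cross-entropy on a toy bigram-style 'LM' (each token's embedding predicts the next token)."""
    hidden = model["embed"](tokens[:, :-1])        # (batch, seq-1, DIM)
    logits = model["lm_head"](hidden)              # (batch, seq-1, VOCAB)
    return nn.functional.cross_entropy(
        logits.reshape(-1, VOCAB), tokens[:, 1:].reshape(-1)
    )

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
batch = torch.randint(0, VOCAB, (8, 16))           # toy training batch

# Phase 1: contextual debiasing of the learned embeddings during training.
for _ in range(100):
    opt.zero_grad()
    (lm_loss(batch) + 1.0 * debias_loss()).backward()
    opt.step()

# Phase 2: finetune on the original language-modelling objective alone.
for _ in range(100):
    opt.zero_grad()
    lm_loss(batch).backward()
    opt.step()
```

In practice the same two-phase schedule would be applied to a full pretrained language model rather than this toy embedding/linear pair; the sketch is only meant to show where a debiasing term would enter relative to the standard objective.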
Paper Type: short