ChatGPT as a Text Simplification Tool to Remove Bias

Published: 01 Jan 2023, Last Modified: 20 May 2025CoRR 2023EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: The presence of specific linguistic signals particular to a certain sub-group of people can be picked up by language models during training. If the model begins to associate specific language with a distinct group, any decisions made based upon this language would hold a strong correlation to a decision based upon their protected characteristic, leading to possible discrimination. We explore a potential technique for bias mitigation in the form of simplification of text. The driving force of this idea is that simplifying text should standardise language between different sub-groups to one way of speaking while keeping the same meaning. The experiment shows promising results as the classifier accuracy for predicting the sensitive attribute drops by up to 17% for the simplified data.
Loading

OpenReview is a long-term project to advance science through improved peer review with legal nonprofit status. We gratefully acknowledge the support of the OpenReview Sponsors. © 2025 OpenReview