ChatGPT Based Data Augmentation for Improved Parameter-Efficient Debiasing of LLMs

Pengrui Han; Rafal Dariusz Kocielnik; Adhithya Prakash Saravanan; Roy Luoyao Jiang; Or Sharir; Anima Anandkumar

ChatGPT Based Data Augmentation for Improved Parameter-Efficient Debiasing of LLMs

Pengrui Han, Rafal Dariusz Kocielnik, Adhithya Prakash Saravanan, Roy Luoyao Jiang, Or Sharir, Anima Anandkumar

Published: 10 Jul 2024, Last Modified: 26 Aug 2024COLMEveryoneRevisionsBibTeXCC BY 4.0

Research Area: Data, Evaluation, Societal implications, Safety

Keywords: large language models, social bias, synthetic data generation, debiasing

TL;DR: This work proposes novel strategies using ChatGPT to create synthetic training data aimed at efficiently debiasing Large Language Models (LLMs)

Abstract: Large Language models (LLMs), while powerful, exhibit harmful social biases. Debiasing is often challenging due to computational costs, data constraints, and potential degradation of multi-task language capabilities. This work introduces a novel approach utilizing ChatGPT to generate synthetic training data, aiming to enhance the debiasing of LLMs. We propose two strategies: Targeted Prompting, which provides effective debiasing for known biases but necessitates prior specification of bias in question; and General Prompting, which, while slightly less effective, offers debiasing across various categories. We leverage resource-efficient LLM debiasing using adapter tuning and compare the effectiveness of our synthetic data to existing debiasing datasets. Our results reveal that: (1) ChatGPT can efficiently produce high-quality training data for debiasing other LLMs; (2) data produced via our approach surpasses existing datasets in debiasing performance while also preserving internal knowledge of a pre-trained LLM; and (3) synthetic data exhibits generalizability across categories, effectively mitigating various biases, including intersectional ones. These findings underscore the potential of synthetic data in advancing the fairness of LLMs with minimal retraining cost.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the COLM Code of Ethics on https://colmweb.org/CoE.html

Author Guide: I certify that this submission complies with the submission instructions as described on https://colmweb.org/AuthorGuide.html

Submission Number: 1321

Loading