Fine-Tuning LLMs with noisy data for political argument generation

Published: 13 Dec 2024, Last Modified: 23 Feb 2025, Good-Data, CC BY 4.0
Student Lead Author Indication: No
Keywords: hate speech, discussion quality, Reddit, Twitter
TL;DR: We compare restricting the fine-tuning dataset with prompting as ways to get models to unlearn undesirable traits while maintaining content relevance and tonal accuracy.
Abstract: The incivility common in social media discourse complicates deploying automated text generation models for politically sensitive content. Fine-tuning and prompting strategies are critical but underexplored means of mitigating toxicity in such contexts. This study investigates the effects of fine-tuning and prompting on GPT-3.5 Turbo using subsets of the CLAPTON dataset of political discussion posts, which comprises Twitter and Reddit data labeled for justification, reciprocity, and incivility. Models fine-tuned on Reddit data scored highest on discussion quality, while combining the noisy data sources led to persistent toxicity. Prompting strategies reduced specific toxic traits, such as personal attacks, but had limited broader impact. The findings emphasize that high-quality data and well-crafted prompts are essential for reducing incivility and improving rhetorical quality in automated political discourse generation.
Submission Number: 42
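The abstract above describes two levers: fine-tuning GPT-3.5 Turbo on curated subsets of the CLAPTON data, and prompting strategies that discourage specific toxic traits. The sketch below is not the authors' code; it is a minimal illustration of how such a pipeline could look with the OpenAI Python SDK, assuming the CLAPTON subsets have already been converted to chat-format JSONL. The file name and system-prompt wording are hypothetical.

```python
# Minimal sketch (assumptions noted above, not the paper's actual pipeline).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# (1) Fine-tune on a single-platform subset (e.g. Reddit-only posts)
# rather than the combined, noisier multi-platform data.
training_file = client.files.create(
    file=open("clapton_reddit_subset.jsonl", "rb"),  # hypothetical filename
    purpose="fine-tune",
)
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print("fine-tuning job:", job.id)

# (2) Prompting strategy: a system instruction targeting specific toxic
# traits such as personal attacks (wording is illustrative, not the paper's).
response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # or the fine-tuned model id once the job finishes
    messages=[
        {
            "role": "system",
            "content": (
                "Argue your position with justification and reciprocity; "
                "do not use personal attacks or other uncivil language."
            ),
        },
        {"role": "user", "content": "Respond to this political discussion post: ..."},
    ],
)
print(response.choices[0].message.content)
```

In this setup, the data-restriction and prompting conditions can be varied independently, which mirrors the comparison the abstract reports between dataset quality and prompt design.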