Say It Another Way: Auditing LLMs with a User-Grounded Automated Paraphrasing Framework

ACL ARR 2025 May Submission 6408 Authors

20 May 2025 (modified: 03 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0
Abstract: Large language models (LLMs) are sensitive to subtle changes in prompt phrasing, complicating efforts to audit them reliably. Prior approaches often rely on arbitrary or ungrounded prompt variations, which may miss key linguistic and demographic factors in real-world usage. We introduce AUGMENT (Automated User-Grounded Modeling and Evaluation of Natural Language Transformations), a framework for systematically generating and evaluating controlled, realistic prompt paraphrases based on linguistic structure and user demographics. AUGMENT ensures paraphrase quality through a combination of semantic, stylistic, and instruction-following criteria. In a case study on the BBQ dataset, we show that user-grounded paraphrasing leads to significant shifts in LLM performance and bias metrics across nine models. Our findings highlight the need for more representative and structured approaches to prompt variation in LLM auditing.
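The abstract states that AUGMENT checks paraphrase quality against semantic, stylistic, and instruction-following criteria, but does not spell out how such checks are implemented. As a rough illustration only, the sketch below shows one plausible semantic-fidelity filter based on sentence embeddings; the embedding model, threshold, and helper function are assumptions for illustration, not the paper's actual pipeline.

```python
# Illustrative sketch only: a minimal semantic-fidelity filter for candidate
# paraphrases. The embedding model, threshold, and helper name are assumptions,
# not AUGMENT's documented quality-control procedure.
from sentence_transformers import SentenceTransformer, util

_model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def keep_paraphrase(original: str, paraphrase: str, threshold: float = 0.85) -> bool:
    """Keep a candidate paraphrase only if it stays semantically close to the original."""
    emb = _model.encode([original, paraphrase], convert_to_tensor=True)
    similarity = util.cos_sim(emb[0], emb[1]).item()
    return similarity >= threshold

# Example usage with a hypothetical BBQ-style prompt:
# keep_paraphrase("Who was more likely to have caused the accident?",
#                 "Which person probably caused the accident?")
```

In practice, a stylistic check (e.g., matching a target register) and an instruction-following check would be layered on top of a semantic filter like this; the cutoff value here is arbitrary and would need tuning.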
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: Ethics, Bias, and Fairness; Language Modeling; Resources and Evaluation
Contribution Types: NLP engineering experiment, Data resources, Data analysis
Languages Studied: English
Submission Number: 6408