Abstract: The widespread adoption of large language models (LLMs) has increased the need for reliable AI-text detection. While current detectors perform well on benchmark datasets, we identify a critical vulnerability: raising the temperature parameter during inference significantly reduces detection accuracy. Exploiting this weakness, we propose TempParaphraser, a simple yet effective paraphrasing framework that simulates high-temperature sampling effects through multiple normal-temperature generations, thereby evading detection. Experiments show that TempParaphraser reduces detector accuracy by an average of 97.3% while preserving high text quality. We also demonstrate that training on TempParaphraser-augmented data improves detector robustness. All resources are publicly available to support future research.
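The abstract gives no implementation details, but the stated idea (approximating the extra token-level randomness of high-temperature sampling by drawing several candidates at a normal temperature and keeping one) can be sketched as below. This is a minimal, hypothetical illustration, not the authors' released code: the model choice, the `paraphrase_pool` function, and the uniform selection step are all assumptions for demonstration purposes.

```python
# Hypothetical sketch of the idea described in the abstract (not the
# authors' method): draw k candidates at a normal temperature and pick
# one at random, mimicking the diversity of a single hot-temperature pass.
import random
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # stand-in model; a real system would prompt an
                     # instruction-tuned paraphraser instead
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

def paraphrase_pool(prompt: str, k: int = 5, max_new_tokens: int = 60) -> str:
    """Generate k continuations at temperature 1.0 and return one chosen
    uniformly at random (assumed selection rule for this sketch)."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            do_sample=True,
            temperature=1.0,          # normal-temperature sampling
            num_return_sequences=k,   # several draws replace one hot draw
            max_new_tokens=max_new_tokens,
            pad_token_id=tokenizer.eos_token_id,
        )
    prompt_len = inputs["input_ids"].shape[1]
    candidates = [
        tokenizer.decode(o[prompt_len:], skip_special_tokens=True)
        for o in outputs
    ]
    return random.choice(candidates)
```

Selecting among multiple normal-temperature samples raises output diversity without the degeneration that very high temperatures cause in a single pass, which is plausibly why the paper reports high text quality alongside the evasion effect.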
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: adversarial attacks/examples/training
Contribution Types: Publicly available software and/or pre-trained models, Theory
Languages Studied: English
Submission Number: 3860