Evade ChatGPT Detectors via A Single Space

Anonymous

16 Feb 2024 | ACL ARR 2024 February Blind Submission | Readers: Everyone

Abstract: ChatGPT brings significant social value but also raises concerns about the misuse of AI-generated text. An important problem, therefore, is how to detect whether a text was generated by ChatGPT or written by a human. Although automated detection methods have been proposed, we find that these detectors do not effectively discriminate between human-generated and AI-generated text on the basis of semantic and stylistic gaps. Instead, "subtle differences", such as an extra space, become crucial for detection. Based on this discovery, we propose the SpaceInfi strategy to evade detection. Experiments demonstrate the effectiveness of this strategy across multiple benchmarks and detectors. We also show empirically that a phenomenon called token mutation causes the evasion for language model-based detectors.
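
The kind of perturbation the abstract describes, adding one extra space to a text, can be illustrated with a minimal sketch. The abstract does not say where the space is placed, so the choice below (directly before a randomly chosen comma) is an assumption for illustration only, and the function name `insert_single_space` is hypothetical rather than the authors' implementation.

```python
import random
from typing import Optional


def insert_single_space(text: str, rng: Optional[random.Random] = None) -> str:
    """Return `text` with exactly one extra space inserted before one comma.

    Assumption: the insertion point (before a random comma) is chosen for
    illustration; the abstract only says that a single extra space is added.
    If the text contains no comma, it is returned unchanged.
    """
    rng = rng or random.Random()
    # Collect the index of every comma in the text.
    comma_positions = [i for i, ch in enumerate(text) if ch == ","]
    if not comma_positions:
        return text
    # Pick one comma and insert a single space directly before it.
    pos = rng.choice(comma_positions)
    return text[:pos] + " " + text[pos:]


if __name__ == "__main__":
    original = "Detectors are widely used, but they can be brittle, as shown here."
    print(insert_single_space(original, random.Random(0)))
```

The perturbed text differs from the original by one character, so its wording and meaning are unchanged for a human reader.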

Paper Type: short

Research Area: Ethics, Bias, and Fairness

Contribution Types: Model analysis & interpretability, NLP engineering experiment

Languages Studied: English
