Diversity Boosts AI-Generated Text Detection

Advik Raj Basani; Pin-Yu Chen

Published: 10 Jun 2025, Last Modified: 13 Jul 2025, Venue: DIG-BUG Short, License: CC BY 4.0

Keywords: LLMs, AI text detection, interpretability, zero-shot

TL;DR: DivEye detects AI-generated text by measuring surprisal-based diversity, exploiting the greater variability and irregularity of human writing compared to machine-generated text.

Abstract: Detecting AI-generated text is increasingly important to prevent misuse in education, journalism, and social media, where synthetic fluency can obscure misinformation. Existing detectors often rely on likelihood heuristics or black-box classifiers, which struggle with high-quality outputs and lack interpretability. We propose *DivEye*, a novel detection framework that leverages surprisal-based features to capture fluctuations in lexical and structural unpredictability, a signal more prominent in human-authored text. *DivEye* outperforms existing zero-shot detectors by up to 33.2%, matches fine-tuned baselines, and boosts existing detectors by up to 18.7% when used as an auxiliary signal. *DivEye* is also robust to paraphrasing and adversarial attacks, generalizes across domains, and offers interpretable insights into rhythmic unpredictability, whose relative absence serves as a key indicator of AI-generated text.
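
To make the idea of surprisal-based diversity concrete, the following is a minimal sketch and not the DivEye feature set itself. It assumes per-token probabilities have already been obtained from some scoring language model; the function name `surprisal_diversity` and the three statistics it returns are hypothetical choices made for illustration. It computes each token's surprisal, -log p, and summarizes how much that surprisal fluctuates across the text.

```python
import math
from statistics import mean, pvariance


def surprisal_diversity(token_probs):
    """Summarize how much per-token surprisal fluctuates in a text.

    token_probs: probabilities (0 < p <= 1) that a scoring language model
    assigned to each observed token. Obtaining them is outside this sketch.
    The returned statistics are illustrative, not the paper's features.
    """
    if len(token_probs) < 2:
        raise ValueError("need at least two tokens")
    # Surprisal of a token is its negative log-probability (in nats).
    surprisal = [-math.log(p) for p in token_probs]
    # First differences capture token-to-token jumps in unpredictability.
    deltas = [b - a for a, b in zip(surprisal, surprisal[1:])]
    return {
        "mean_surprisal": mean(surprisal),
        "var_surprisal": pvariance(surprisal),
        "mean_abs_delta": mean([abs(d) for d in deltas]),
    }


# Made-up probabilities: a uniformly predictable sequence versus one that
# alternates between expected and surprising tokens.
flat = surprisal_diversity([0.5] * 6)
varied = surprisal_diversity([0.9, 0.05, 0.8, 0.1, 0.95, 0.02])
print(flat["var_surprisal"], varied["var_surprisal"])
```

Under the abstract's premise, human-authored text would tend to look more like the second sequence, with larger variance and larger jumps between neighboring tokens. A real detector would still need a scoring model and a decision rule on top of statistics like these.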

Submission Number: 13
