Keywords: Fairness, Accountability, and Transparency, Generative Models, Text Analysis, Natural Language Processing, Benchmarks, Learning Theory
TL;DR: We find LLMs have a unique aversion to repeating words (a "Vestigial Heuristic" from early training) and develop a highly effective method to detect LLM-generated text by measuring these repetition patterns.
Abstract: Distinguishing Large Language Model (LLM) generated text from human writing is a critical and difficult challenge. While LLMs are trained to write like humans, we hypothesize that this training leaves an indelible mark: LLMs develop a particularly strong aversion to token repetition very early in training. This bias persists as a "Vestigial Heuristic" (a developmental artifact) that surfaces in LLM-generated text and separates it from human writing. To probe this phenomenon, we introduce Telescope Perplexity, a metric that evaluates the model's treatment of token repetition via its next-token probabilities $P(s_i \mid s_{1:i})$. Our empirical investigation reveals that the Telescope Perplexity signature emerges early in pre-training and enables highly effective zero-shot LLM detection. We show state-of-the-art or competitive performance across diverse datasets (including modern evaluation sets we introduce), reference models, and perturbation schemes, with greater efficiency than other methods.
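The abstract does not spell out how Telescope Perplexity is computed, but the conditional $P(s_i \mid s_{1:i})$ suggests the standard autoregressive factorization. As a hedged illustration only (the function below is a generic perplexity calculation, not the authors' metric), perplexity over a sequence is the exponentiated negative mean log-probability of its tokens:

```python
import math

def perplexity(token_probs):
    """Perplexity from per-token conditional probabilities P(s_i | prefix).

    perplexity = exp(-(1/N) * sum_i log P(s_i | prefix_i))

    Lower values mean the model found the sequence more predictable.
    """
    n = len(token_probs)
    log_likelihood = sum(math.log(p) for p in token_probs)
    return math.exp(-log_likelihood / n)

# Sanity check: a model assigning uniform probability 1/V to every
# token yields perplexity exactly V, regardless of sequence length.
V = 50
uniform_probs = [1.0 / V] * 10
print(perplexity(uniform_probs))  # 50.0 (up to floating point)
```

A repetition-focused variant like Telescope Perplexity would presumably restrict or reweight this sum toward repeated tokens, but that definition is the paper's, not shown here.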
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 22321