What Are We Detecting, Really? LLM-Generated Text Detection Remains an Unsolved Problem

23 May 2025 (modified: 29 Oct 2025) · Submitted to NeurIPS 2025 Position Paper Track · CC BY 4.0
Keywords: large language models, LLM-Generated Text, machine-generated text, human-in-the-loop, detectors
TL;DR: In most practical cases, it is not possible to accurately detect LLM-generated text.
Abstract: This position paper argues that, in most practical cases, it is not possible to accurately detect LLM-generated text. We take "LLM-generated text" to mean content produced by LLMs in response to ordinary prompts. As the names "LLM-generated text" and "human-written text" imply, the difference lies in how the text is produced; in practice, however, we can only evaluate the final output—the text itself—where human- and machine-generated content often overlap substantially. The numerical results of LLM-generated text detection are frequently misinterpreted, and their significance is diminishing. Detectors can serve a purpose under specific conditions, but their results should be treated as a cautious reference rather than a decisive indicator.
Submission Number: 745