When Machines Write: A method for detecting AI-edited text

Jiashun Jin; Tracy Ke; Gabriel Moryoussef

13 Sept 2025 (modified: 29 Nov 2025); ICLR 2026 Conference Withdrawn Submission; CC BY 4.0

Keywords: Higher Criticism; prompt; Topic-SCORE; authorship attribution

TL;DR: We introduce a new method for detecting AI-edited text. It combines sparse feature selection with prompting.

Abstract: Existing AI-text detectors have reported great success in detecting AI-generated content produced by text completion and question answering. We consider a more challenging problem: distinguishing human-written content from human-written, AI-edited content (hwAI-generated text), where the signals are weaker and existing methods are less satisfactory. We propose word-list-assisted prompting as a new method. It is based on two empirical observations: (i) word-count features, despite being sparse, are powerful in detecting hwAI-generated text; (ii) direct prompting, though conventionally not recommended, becomes effective once a selected word list is supplied in the prompt. To select this word list, we develop two feature selection methods that leverage advances in large-scale multiple testing and topic modeling. Our prompting approach, powered by these feature selection methods, performs well in detecting hwAI-generated text on several data sets containing academic abstracts, movie reviews, and news.
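The abstract does not spell out the pipeline, so the following is only a minimal sketch of how a word list might be selected and placed in a prompt. Everything in it is an assumption made for illustration: the two-proportion z-test on word frequencies, the Higher Criticism threshold in its k/n-normalized form, the function names (`word_pvalues`, `hc_select`, `build_prompt`), the `alpha0` parameter, and the prompt wording are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def word_pvalues(human_docs, edited_docs):
    """Two-sided p-value per word from a two-proportion z-test (assumed test).

    Both corpora are assumed to be non-empty lists of whitespace-tokenizable text.
    """
    h = Counter(w for d in human_docs for w in d.lower().split())
    e = Counter(w for d in edited_docs for w in d.lower().split())
    n_h, n_e = sum(h.values()), sum(e.values())
    pvals = {}
    for w in set(h) | set(e):
        pooled = (h[w] + e[w]) / (n_h + n_e)
        se = math.sqrt(pooled * (1 - pooled) * (1 / n_h + 1 / n_e))
        z = (e[w] / n_e - h[w] / n_h) / se if se > 0 else 0.0
        # Two-sided p-value under a standard normal approximation.
        pvals[w] = math.erfc(abs(z) / math.sqrt(2))
    return pvals


def hc_select(pvals, alpha0=0.5):
    """Keep the words whose p-values fall at or below the HC threshold.

    Scans the smallest alpha0 fraction of sorted p-values and picks the
    index k that maximizes the Higher Criticism objective.
    """
    items = sorted(pvals.items(), key=lambda kv: kv[1])
    n = len(items)
    best_k, best_hc = 0, -math.inf
    for k in range(1, max(1, int(alpha0 * n)) + 1):
        frac = k / n
        if frac >= 1:  # objective is undefined at k = n
            break
        hc = math.sqrt(n) * (frac - items[k - 1][1]) / math.sqrt(frac * (1 - frac))
        if hc > best_hc:
            best_hc, best_k = hc, k
    return [w for w, _ in items[:best_k]]


def build_prompt(text, word_list):
    """Assemble a detection prompt that carries the selected word list."""
    words = ", ".join(word_list)
    return (
        "Decide whether the text below was written by a human, or written "
        "by a human and then edited by AI.\n"
        f"Words whose frequency tends to differ between the two classes: {words}\n"
        f"Text: {text}\n"
        "Answer with 'human' or 'AI-edited'."
    )
```

A caller would compute `word_pvalues` on labeled training text, pass the result to `hc_select`, and send the string from `build_prompt` to a language model of their choice. The model call is deliberately left out because the abstract does not name one.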

Supplementary Material: pdf

Primary Area: interpretability and explainable AI

Submission Number: 4598
