Keywords: Higher Criticism; prompt; Topic-SCORE; authorship attribution
TL;DR: We introduce a new method for detecting AI-edited text. It combines sparse feature selection with prompting.
Abstract: Existing AI-text detectors have reported great success in detecting AI-generated content produced by text completion and question answering. We consider a more challenging problem: distinguishing human-written content from human-written, AI-edited content (hwAI-generated text), where the signals are weaker and existing methods are less satisfactory. We propose word-list-assisted prompting as a new method. It is based on two empirical observations: (i) Word-count features, despite being sparse, are powerful in detecting hwAI-generated text. (ii) Direct prompting, though conventionally not recommended, becomes effective once a selected word list is supplied in the prompt. To select this word list, we develop two feature selection methods that build on advances in large-scale multiple testing and topic modeling. Our prompting approach, powered by these feature selection methods, performs well in detecting hwAI-generated text on several data sets containing academic abstracts, movie reviews, and news.
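Illustrative sketch (not the authors' implementation): the keywords name Higher Criticism, a thresholding rule from large-scale multiple testing, so one plausible shape for a word-list selector followed by a word-list-assisted prompt is shown below. Everything specific here is an assumption made for illustration: the function names `word_pvalues`, `hc_select`, and `build_prompt`, the pooled two-proportion z-test used to obtain per-word p-values, the `alpha0` search fraction, and the prompt wording. The Topic-SCORE-based selector is not sketched.

```python
import math
from collections import Counter


def word_pvalues(human_docs, edited_docs):
    """Two-sided p-value per word from a pooled two-proportion z-test (assumed test)."""
    human = Counter(w for d in human_docs for w in d.lower().split())
    edited = Counter(w for d in edited_docs for w in d.lower().split())
    n_h, n_e = sum(human.values()), sum(edited.values())
    if n_h == 0 or n_e == 0:
        raise ValueError("both corpora must contain at least one word")
    pvals = {}
    for w in set(human) | set(edited):
        a, b = human[w], edited[w]
        pooled = (a + b) / (n_h + n_e)
        se = math.sqrt(pooled * (1 - pooled) * (1 / n_h + 1 / n_e))
        z = (a / n_h - b / n_e) / se if se > 0 else 0.0
        pvals[w] = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return pvals


def hc_select(pvals, alpha0=0.5):
    """Keep the words ranked at or before the index that maximizes the HC statistic."""
    items = sorted(pvals.items(), key=lambda kv: kv[1])
    n = len(items)
    if n == 0:
        return []
    best_hc, best_i = -math.inf, 1
    # Search only the smallest alpha0 fraction of the sorted p-values.
    for i, (_, p) in enumerate(items[: max(1, int(alpha0 * n))], start=1):
        p = min(max(p, 1e-12), 1 - 1e-12)  # clip so the denominator stays positive
        hc = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1 - p))
        if hc > best_hc:
            best_hc, best_i = hc, i
    return [w for w, _ in items[:best_i]]


def build_prompt(text, words):
    """Assemble a detection prompt that lists the selected words (hypothetical wording)."""
    return (
        "Words whose frequency tends to differ between human-written and "
        "AI-edited text: " + ", ".join(words) + ".\n"
        "Using this list as a hint, answer 'human' or 'AI-edited' for the text below.\n\n"
        + text
    )
```

A call such as `build_prompt(text, hc_select(word_pvalues(human_docs, edited_docs)))` would chain the three steps; the resulting string would then be sent to whichever language model is used as the detector.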
Supplementary Material: pdf
Primary Area: interpretability and explainable AI
Submission Number: 4598