Detection of Partially-Synthesized LLM Text

Published: 09 Oct 2024, Last Modified: 04 Dec 2024
Venue: SoLaR Poster
License: CC BY 4.0
Track: Technical
Keywords: LLM text detection, black-box detection, mixed texts, distribution shift
Abstract: Advances in large language models (LLMs) have produced artificial text that appears increasingly human-like and is difficult to detect by eye. To improve the safety of LLMs and mitigate potential nefarious uses, it is desirable to develop automated detectors that can distinguish human-written from LLM-written text. While recent work has focused on classifying entire text samples (e.g., paragraphs) as human- or LLM-written, this paper investigates the setting where a text's individual segments (e.g., sentences) could each be written by either a human or an LLM. We study two relevant problems: (i) estimating the percentage of a text that was LLM-written, and (ii) determining which segments were LLM-written. To this end, we propose Partial-LLM Detector (PaLD), a black-box method that leverages the scores of text classifiers. Experimentally, we demonstrate the effectiveness of PaLD compared to baseline methods that build on prior text detectors.
Submission Number: 55
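
Illustrative sketch: the two problems named in the abstract can be made concrete with a naive per-segment approach. This is not PaLD, whose procedure is not described here; it only shows the inputs and outputs of the task. The names `naive_segment_detection`, `toy_score`, and the 0.5 threshold are hypothetical, the scorer is a placeholder rather than a real detector, and measuring the percentage by segment count (rather than by tokens or characters) is an assumption.

```python
from typing import Callable, List, Tuple


def naive_segment_detection(
    segments: List[str],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> Tuple[float, List[bool]]:
    """Flag each segment independently, then report the flagged fraction.

    score_fn is any black-box classifier returning a score in [0, 1],
    where higher means "more likely LLM-written".
    """
    if not segments:
        return 0.0, []
    # Problem (ii): decide per segment whether it looks LLM-written.
    flags = [score_fn(s) >= threshold for s in segments]
    # Problem (i): fraction of segments flagged (assumption: counted by segment).
    return sum(flags) / len(flags), flags


def toy_score(segment: str) -> float:
    # Placeholder scorer for demonstration only; NOT a real detector.
    return 0.9 if "as an ai" in segment.lower() else 0.1


text = [
    "I walked to the store.",
    "As an AI, I cannot shop.",
    "It rained all day.",
    "The bus was late.",
]
fraction, flags = naive_segment_detection(text, toy_score)
print(fraction, flags)
```

Scoring each segment in isolation ignores context from neighboring segments, which is one reason a dedicated method for mixed texts may be preferable to a thresholded whole-text detector.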