Language Models Understand Themselves Better: A Zero-Shot AI-Generated Text Detection Method via Reading and Writing
Abstract: The rapid development and widespread adoption of large language models (LLMs) in recent years have introduced significant risks, necessitating robust methods to distinguish AI-generated content from human-written text. Traditional training-based approaches often lack flexibility, adapt poorly to new domains, and make predictions without supporting evidence, which limits their interpretability. To address these issues, we propose a novel zero-shot detection framework, the Reading and Writing detection method. Our approach uses an autoregressive model to assess the intrinsic complexity of a text ("reading") and an autoencoder model to quantify the difficulty of reconstructing it ("writing"). Integrating these two metrics highlights the substantial differences between machine-generated and human-written text. We conduct extensive experiments on four large public datasets covering state-of-the-art LLMs, including GPT-3.5, GPT-4, and open-source models such as LLaMA. The results demonstrate that our detection method shows strong potential across various language generation models and text domains.
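Conceptually, the two signals described in the abstract can be approximated with off-the-shelf models: the token-level loss of a causal language model as the "reading" score, and the denoising reconstruction loss of a sequence-to-sequence autoencoder as the "writing" score. The sketch below illustrates this idea only; the model choices (gpt2, facebook/bart-base), the masking rate, and the additive combination in detection_score are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a reading/writing-style zero-shot detector.
# NOTE: model names, the 15% masking rate, and the score combination
# are assumptions for illustration, not the authors' released method.
import torch
from transformers import (AutoTokenizer, AutoModelForCausalLM,
                          BartTokenizer, BartForConditionalGeneration)

ar_tok = AutoTokenizer.from_pretrained("gpt2")
ar_lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()
ae_tok = BartTokenizer.from_pretrained("facebook/bart-base")
ae_lm = BartForConditionalGeneration.from_pretrained("facebook/bart-base").eval()

@torch.no_grad()
def reading_score(text: str) -> float:
    """'Reading': mean negative log-likelihood under the autoregressive model."""
    ids = ar_tok(text, return_tensors="pt").input_ids
    return ar_lm(ids, labels=ids).loss.item()

@torch.no_grad()
def writing_score(text: str, mask_rate: float = 0.15) -> float:
    """'Writing': mask a fraction of tokens, score the autoencoder's reconstruction."""
    ids = ae_tok(text, return_tensors="pt").input_ids
    corrupted = ids.clone()
    mask = torch.rand(ids.shape) < mask_rate
    mask[:, 0] = False   # keep BOS intact
    mask[:, -1] = False  # keep EOS intact
    corrupted[mask] = ae_tok.mask_token_id
    return ae_lm(input_ids=corrupted, labels=ids).loss.item()

def detection_score(text: str) -> float:
    # Hypothetical combination: machine-generated text tends to be "easier"
    # to read and rewrite, so low combined loss suggests machine origin.
    return reading_score(text) + writing_score(text)
```

Under the usual perplexity-based intuition, human-written text yields higher losses from both models than machine-generated text, so thresholding the combined score gives a simple zero-shot decision rule; the paper's actual integration of the two metrics may differ.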
External IDs: dblp:conf/ijcnn/HuangZLLSCL25