Abstract: In recent years, large language models (LLMs) have progressed rapidly, raising growing concerns about the proliferation of difficult-to-distinguish AI-generated content. This has given rise to a range of problems, including fake news, academic fraud, and phishing emails, posing significant dangers across various domains. However, current machine-generated text (MGT) detection methods still face challenges: many require access to a model's output logits or losses, which prevents them from adapting to real-world black-box scenarios, and models with large parameter counts are difficult to deploy. We therefore propose a compression-based lightweight network for MGT detection that leverages the ability of lossless compression to effectively extract discriminative features between categories. With fewer parameters, our framework achieves state-of-the-art performance in MGT detection under black-box conditions. Experiments demonstrate that our approach performs exceptionally well on both Chinese and English datasets. Specifically, our method achieves a full-text detection accuracy of 99.5%, surpassing the previous SOTA method.
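The abstract's core idea is that lossless compression can expose category-level regularities in text. As a minimal illustrative sketch of that idea (not the paper's actual lightweight network, and with all names and reference texts hypothetical), compression-based similarity can be computed via the normalized compression distance (NCD) using gzip:

```python
# Sketch (assumption): compression-based text classification via normalized
# compression distance (NCD). Texts from the same category tend to share
# statistical regularities, so concatenating them compresses better than
# concatenating texts from different categories.
import gzip


def clen(text: str) -> int:
    """Length in bytes of the gzip-compressed UTF-8 encoding of `text`."""
    return len(gzip.compress(text.encode("utf-8")))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance: lower means more similar."""
    ca, cb, cab = clen(a), clen(b), clen(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def classify(sample: str, references: dict[str, list[str]]) -> str:
    """Assign `sample` to the label whose reference texts are closest in NCD."""
    return min(
        references,
        key=lambda label: min(ncd(sample, ref) for ref in references[label]),
    )
```

For MGT detection, the reference sets would hold known human-written and machine-generated texts; the paper's contribution replaces this nearest-reference rule with a lightweight learned network over compression-derived features.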