Decoupling Content and Expression: Two-Dimensional Detection of AI-Generated Text

ACL ARR 2025 February Submission3982 Authors

15 Feb 2025 (modified: 09 May 2025) · ACL ARR 2025 February Submission · CC BY 4.0
Abstract: The widespread use of LLMs creates a critical need to detect AI involvement in text. Existing studies address such detection in scattered contexts, leaving a systematic, unified approach unexplored. In this paper, we present \emph{HART}, a hierarchical framework of AI risk levels, each corresponding to a detection task. To address these tasks, we propose a novel \emph{2D Detection Method} that decouples a text into content and language expression. Our findings show that content is resistant to surface-level changes and can therefore serve as a key feature for detection. Experiments demonstrate that the 2D method significantly outperforms existing detectors, improving AUROC from 0.705 to 0.849 on level-2 detection and from 0.807 to 0.886 on RAID. We release our data and code at \url{https://github.com/xxxx}.
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: AI-generated text detection
Contribution Types: Publicly available software and/or pre-trained models, Data resources
Languages Studied: English, Chinese, French, Spanish, Arabic
Submission Number: 3982