Keywords: AI-generated text detection, prediction discrepancy modeling, fine-tuning
Abstract: Recent advances in large language models (LLMs) have enabled them to generate text with increasingly human-like linguistic styles, posing significant challenges for AI-generated text detection (AGTD). Mainstream zero-shot AGTD methods primarily compute token-level AI-likeness scores from a machine-centric perspective, as represented by proxy models, and treat all tokens equally when aggregating the overall detection score. However, these methods overlook the predictive discrepancies between humans and LLMs when interpreting the same text. Our key intuition is that tokens exhibiting greater divergence between human and machine predictions offer stronger cues for authorship attribution. To address this limitation, we propose \textbf{HAPDA}, a \underline{h}uman-m\underline{a}chine \underline{p}redictive \underline{d}iscrepancy \underline{a}dapter for the AGTD task. HAPDA consists of (i) a joint fine-tuning strategy for training paired human and machine preference models, and (ii) a discrepancy-aware reweighting mechanism that calibrates token-level detection scores in downstream detectors. Extensive experiments across multiple datasets demonstrate that HAPDA consistently and significantly improves the performance of five representative baselines under diverse evaluation settings.
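As a rough illustration of the discrepancy-aware reweighting in (ii), the sketch below assumes paired human- and machine-preference models that expose per-token logits for the text under test. The function name hapda_reweighted_score, the absolute log-probability gap as the discrepancy measure, and the softmax weighting rule are all illustrative assumptions, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def hapda_reweighted_score(detector_logits, human_logits, machine_logits, input_ids):
    """Aggregate token-level detection scores, upweighting tokens where the
    paired human- and machine-preference models disagree most.

    All *_logits tensors have shape (seq_len, vocab_size); input_ids has
    shape (seq_len,). This is a hypothetical sketch, not HAPDA's actual
    scoring rule.
    """
    # Log-probability of each observed token under a given model.
    def token_logprobs(logits):
        return F.log_softmax(logits, dim=-1).gather(
            -1, input_ids.unsqueeze(-1)).squeeze(-1)

    detector_lp = token_logprobs(detector_logits)  # token-level AI-likeness cue
    human_lp = token_logprobs(human_logits)
    machine_lp = token_logprobs(machine_logits)

    # Predictive discrepancy per token: how differently the human- and
    # machine-preference models rate the same token (assumed measure).
    discrepancy = (human_lp - machine_lp).abs()

    # Normalize discrepancies into weights over the sequence; equal
    # discrepancies recover the uniform token-averaging baseline.
    weights = F.softmax(discrepancy, dim=-1)

    # Discrepancy-weighted aggregate replaces the equal-weight sum.
    return (weights * detector_lp).sum()

Normalizing the weights with a softmax keeps them summing to one, so the reweighted score stays on the same scale as the unweighted baseline and can be dropped into existing token-level detectors without recalibrating their decision thresholds.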
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 7630