Is your Reviewer 2 Human? Tracing a Novel Problem of LLM Authorship in Academic Peer Reviews

ACL ARR 2025 May Submission 6457 Authors

20 May 2025 (modified: 03 Jul 2025) · License: CC BY 4.0
Abstract: The integrity of the peer review process is crucial for maintaining scientific rigor and trust in academic publishing. This process relies on (human) domain experts critically evaluating the merit of submitted manuscripts. However, the peer review system faces growing strain from rising submission volumes and limited reviewer availability, prompting *lazy reviewing* practices in which reviewers use large language models (LLMs) to generate reviews, raising concerns about the quality, reliability, and accountability of those evaluations. Previous work has focused on estimating the proportion of AI-generated peer reviews or on developing detectors for AI-generated text. However, existing detectors are not robust to adversarial attacks and often require domain- or model-specific retraining. To address these challenges, we propose a framework for peer review watermarking. Our framework comprises a Query-Aware Response Generation module that, given the submitted research paper, selectively embeds subtle yet detectable signals while preserving scientific terminology, and a watermark detection mechanism that enables editors to reliably verify the authenticity of reviews. Extensive experiments on ICLR and NeurIPS data demonstrate that our method outperforms various AI text detectors under adversarial attacks. We hope this work will facilitate the further development of watermarking and the responsible use of LLM systems. We make our code and dataset publicly available.
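The abstract describes the pipeline only at a high level, so the sketch below illustrates one standard family of text watermarking schemes ("green-list" logit biasing with a z-test detector, in the style of Kirchenbauer et al., 2023), not the authors' actual Query-Aware Response Generation module. All names and parameters here (`GREEN_FRACTION`, `BIAS`, `KEY`, the toy softmax sampler) are illustrative assumptions.

```python
# Minimal sketch of green-list watermarking + statistical detection.
# This is NOT the paper's method; it only illustrates the general idea of
# embedding a subtle, key-dependent signal at generation time and later
# verifying it with a hypothesis test.

import hashlib
import math
import random

GREEN_FRACTION = 0.5   # fraction of the vocabulary marked "green" per step (assumed)
BIAS = 4.0             # additive logit boost for green tokens (assumed)
KEY = "editor-secret"  # key shared between generation and detection sides (assumed)

def green_list(prev_token, vocab):
    """Pseudo-randomly partition the vocabulary, seeded by the key and previous token."""
    seed = int(hashlib.sha256((KEY + prev_token).encode()).hexdigest(), 16)
    rng = random.Random(seed)
    shuffled = sorted(vocab)
    rng.shuffle(shuffled)
    return set(shuffled[: int(len(shuffled) * GREEN_FRACTION)])

def sample_watermarked(prev_token, logits, vocab):
    """Sample the next token after boosting green-list logits (softmax sampling)."""
    greens = green_list(prev_token, vocab)
    boosted = {t: l + (BIAS if t in greens else 0.0) for t, l in logits.items()}
    m = max(boosted.values())
    weights = {t: math.exp(l - m) for t, l in boosted.items()}
    total = sum(weights.values())
    r, acc = random.random() * total, 0.0
    for token, w in weights.items():
        acc += w
        if acc >= r:
            return token
    return token

def detect(tokens, vocab):
    """Return a z-score; values well above ~4 suggest the watermark is present."""
    n = len(tokens) - 1
    if n < 1:
        return 0.0
    hits = sum(tok in green_list(prev, vocab) for prev, tok in zip(tokens, tokens[1:]))
    expected = n * GREEN_FRACTION
    std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (hits - expected) / std
```

In a setting like the one the abstract describes, the detection side (e.g., an editor) holds the key: text generated without it yields a near-zero z-score, while watermarked text shows a statistically significant excess of green tokens, which is what makes such schemes harder to defeat than classifier-based AI text detectors.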
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: educational applications, ethical considerations in NLP applications, transparency
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 6457