Safeguarding Scientific Peer Review through Watermarking of LLM-Generated Text

ACL ARR 2026 January Submission5982 Authors

05 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: Peer Review, Ethics, Scientific document processing, LLM
Abstract: The integrity of peer review is essential to uphold scientific rigor and trust in academic publishing. Traditionally dependent on domain experts' critical judgment, the process now faces growing strain from surging submission volumes and limited reviewer availability. This has led to the rise of lazy reviewing, where reviewers rely on large language models (LLMs) to generate reviews---raising concerns over fairness, accountability, and the authenticity of evaluations. Prior efforts have focused on detecting AI-generated content or estimating its prevalence in reviews, yet existing detectors remain vulnerable to adversarial attacks and often require domain-specific retraining. To address these limitations, we introduce a novel peer review watermarking framework that embeds traceable, query-aware signals into generated reviews without compromising scientific coherence. Our approach integrates a Query-Aware Response Generation module with a watermark detection mechanism that enables editors to reliably verify review authenticity. Comprehensive experiments on ICLR and NeurIPS datasets demonstrate that our framework surpasses existing AI-text detectors under diverse adversarial conditions. We hope this work advances the responsible integration of LLMs into scholarly communication. We make our code and datasets publicly available.
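The abstract does not spell out the watermarking scheme, but the general idea of statistical LLM-text watermarking can be illustrated with a minimal sketch. This is not the authors' method; it is a hypothetical "green-list" style watermark (pseudo-randomly biasing token choices seeded on the previous token) with a z-score detector, assuming the vocabulary, `fraction`, and hash seeding are illustrative choices:

```python
import hashlib
import math

def green_list(prev_token: str, vocab: list[str], fraction: float = 0.5) -> set[str]:
    """Pseudo-randomly partition the vocabulary into a 'green' subset,
    seeded by the previous token (hypothetical scheme, not the paper's)."""
    scored = sorted(
        vocab,
        key=lambda t: hashlib.sha256((prev_token + "|" + t).encode()).hexdigest(),
    )
    return set(scored[: int(len(vocab) * fraction)])

def detect(tokens: list[str], vocab: list[str], fraction: float = 0.5) -> float:
    """Count how often a token falls in its context's green list and
    return a z-score; large positive values suggest the watermark."""
    hits = sum(
        1
        for prev, tok in zip(tokens, tokens[1:])
        if tok in green_list(prev, vocab, fraction)
    )
    n = len(tokens) - 1
    mean = n * fraction
    var = n * fraction * (1 - fraction)
    return (hits - mean) / math.sqrt(var)
```

A watermarked generator that prefers green tokens yields a high z-score, while unbiased text scores near zero; an editor-side detector would threshold this statistic. The paper's actual framework additionally conditions the signal on the review query, which this sketch omits.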
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Ethics, Bias, and Fairness, NLP Applications, Generation
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 5982