Abstract: Large language models (LLMs) are increasingly integrated into academic workflows, with many conferences and journals permitting their use for tasks such as language refinement and literature summarization. However, their use in peer review remains prohibited due to concerns about confidentiality breaches, hallucinated content, and inconsistent evaluations. As LLM-generated text becomes increasingly difficult to distinguish from human writing, there is a growing need for reliable attribution mechanisms to preserve the integrity of the review process. In this work, we evaluate topic-based watermarking (TBW), a semantic-aware technique designed to embed detectable signals into LLM-generated text. We conduct a systematic assessment across multiple LLM configurations, including base, few-shot, and fine-tuned variants, using authentic peer review data from academic conferences. Our results show that TBW maintains review quality relative to non-watermarked outputs while demonstrating robust detection performance under paraphrasing. These findings highlight the viability of TBW as a minimally intrusive and practical solution for LLM attribution in peer review settings.
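To make the attribution mechanism concrete: TBW-style schemes bias generation toward a topic-seeded "green" subset of the vocabulary, and detection tests whether a suspect text contains more green tokens than chance predicts. The Python sketch below is a minimal illustration only, not the paper's implementation; the vocabulary size, green-list fraction, SHA-256 seeding, and the green_list/detect helpers are all our own assumptions.

import hashlib
import math

VOCAB_SIZE = 50_000      # assumed vocabulary size
GREEN_FRACTION = 0.5     # assumed fraction of the vocabulary marked "green"

def green_list(topic: str, vocab_size: int = VOCAB_SIZE) -> set[int]:
    """Derive a topic-seeded pseudo-random green list of token ids (assumed scheme)."""
    green = set()
    for token_id in range(vocab_size):
        # Hash (topic, token_id) to decide membership deterministically.
        digest = hashlib.sha256(f"{topic}:{token_id}".encode()).digest()
        if digest[0] / 255.0 < GREEN_FRACTION:
            green.add(token_id)
    return green

def detect(token_ids: list[int], topic: str) -> float:
    """Return a z-score for the hypothesis that the text is watermarked.

    Under the null (no watermark), each token lands in the green list with
    probability GREEN_FRACTION, so the green count is binomial and we can
    standardize it into a z-score.
    """
    n = len(token_ids)
    if n == 0:
        return 0.0
    green = green_list(topic)
    hits = sum(1 for t in token_ids if t in green)
    expected = GREEN_FRACTION * n
    std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (hits - expected) / std

# Usage: detect(tokenized_review, inferred_topic); a z-score above ~4 would be
# strong evidence of a watermark, while paraphrasing erodes but may not erase it.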
Paper Type: Short
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: ethical considerations in NLP applications, transparency, policy and governance, reflections and critiques
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Previous URL: https://openreview.net/forum?id=8mB27BawC4
Explanation Of Revisions PDF: pdf
Reassignment Request Area Chair: Yes, I want a different area chair for our submission
Reassignment Request Reviewers: Yes, I want a different set of reviewers
Justification For Not Keeping Action Editor Or Reviewers: We request different reviewers due to several procedural and quality concerns from the previous review round:
1. Double-blind review violation: Reviewer ftg6 explicitly stated "Knowledge Of Paper Source: Preprint on arxiv" and "Impact Of Knowledge Of Paper: Somehow", indicating they searched for and found our work online. This compromises the anonymity essential to unbiased evaluation and violates review protocols.
2. Potential conflict of interest: The same reviewer recommended incorporating comparisons against methods that appear to be their own then-unpublished work (citing papers that did not become publicly available until one week before our original submission deadline), creating a potential self-citation conflict.
3. Reviewer expertise mismatch: Reviewer cB1h gave a high overall assessment (4.0) but explicitly stated "Confidence: 1 = Not my area, or paper is very hard to understand. My evaluation is just an educated guess." This indicates insufficient expertise to properly evaluate technical contributions in this specialized area.
4. Inconsistent evaluation quality: The combination of low confidence with high scores, alongside the anonymity breach, suggests the previous review panel may not provide the rigorous, fair evaluation our substantially revised work deserves.
We have made revisions addressing the technical concerns raised (repositioned scope, moved baseline comparisons to the main text, strengthened justification), and believe our work merits evaluation by reviewers who can assess it within proper double-blind protocols and with appropriate domain expertise.
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: Yes
A2 Elaboration: “Limitations” and “Ethical Considerations” Sections (Page 5)
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: Sections 2, 3, 4, Appendices B, C, D
B2 Discuss The License For Artifacts: No
B2 Elaboration: The paper does not discuss the license or terms of use and/or distribution of any artifacts because the work focuses on adapting and evaluating existing methods, such as Topic-Based Watermarking and open-source language models like Llama-3.1-8B, without releasing new datasets, software, or models. As no new artifacts are distributed in the current submission, we considered licensing details out of scope. If artifacts are released in the future, appropriate licensing information will be provided to ensure clarity and reproducibility.
B3 Artifact Use Consistent With Intended Use: Yes
B3 Elaboration: Sections 2, 3, 4, Appendices B, C, D
B4 Data Contains Personally Identifying Info Or Offensive Content: Yes
B4 Elaboration: Section 3.1.1.
B5 Documentation Of Artifacts: No
B5 Elaboration: The paper does not provide formal documentation of the artifacts in terms of domain coverage, languages, linguistic phenomena, or demographic groups. The dataset was sourced from a well-known public platform (OpenReview) and is restricted to a narrowly defined domain: academic peer reviews from machine learning conferences. Since the data consist only of titles, abstracts, and reviews, and the paper focuses on evaluating a technical method (topic-based watermarking), additional documentation was deemed unnecessary.
B6 Statistics For Data: Yes
B6 Elaboration: Appendix C
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: Appendix C
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: Appendices C, F
C3 Descriptive Statistics: Yes
C3 Elaboration: Section 4, Appendices D, E, F
C4 Parameters For Packages: Yes
C4 Elaboration: Sections 3, 4, Appendix D
D Human Subjects Including Annotators: No
D1 Instructions Given To Participants: N/A
D2 Recruitment And Payment: N/A
D3 Data Consent: N/A
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: N/A
E Ai Assistants In Research Or Writing: Yes
E1 Information About Use Of Ai Assistants: No
E1 Elaboration: In accordance with ARR policy and to preserve anonymity during double-blind review, we have not included this information in the current version of the submission. As required by ARR's anonymization guidelines, the acknowledgments section has been omitted to avoid any potential identity disclosure. If the paper is accepted, we will include an appropriate acknowledgment detailing the use of AI assistants (e.g., for writing and coding assistance), in compliance with the ACL Policy on AI Writing Assistance.
Author Submission Checklist: yes
Submission Number: 752