Gen-Review: A Dataset and Large-scale Study of AI-Generated and Human-Authored Peer Reviews

08 May 2025 (modified: 30 Oct 2025), Submitted to NeurIPS 2025 Datasets and Benchmarks Track, CC BY 4.0
Keywords: Peer-review, LLM, Dataset, GenAI
TL;DR: A Dataset and Large-scale Study of AI-Generated and Human-Authored Peer Reviews
Abstract: How does the increased adoption of Large Language Models (LLMs) impact scientific peer review? This multifaceted question is fundamental to the integrity and outcomes of the scientific process. Recent evidence suggests that LLMs may already have been used for peer review, e.g., at the 2024 International Conference on Learning Representations (ICLR), and the integration of LLMs into peer review has been confirmed by various editorial boards (including that of ICLR'25). Answering this question requires a comprehensive dataset, which has been lacking until now. We therefore present _Gen-Review_, the largest dataset of LLM-written reviews so far. Our dataset includes 81K reviews generated for all submissions to the 2018--2025 editions of ICLR by providing the LLM with three independent prompts: a negative, a positive, and a neutral one. _Gen-Review_ also links to the papers and the conference reviews, thereby enabling a broad range of investigations. As a first step, we use _Gen-Review_ to scrutinize: whether LLMs exhibit bias in reviewing (they do); whether LLM-written reviews can be automatically detected (so far, they can); whether LLMs can rigorously follow reviewing instructions (not always); and whether LLM-provided ratings align with a paper's final outcome (they do only for accepted papers). Link to _Gen-Review_: https://anonymous.4open.science/r/gen_review/.
Croissant File: json
Dataset URL: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/PYDPEZ
Code URL: https://anonymous.4open.science/r/gen_review
Supplementary Material: zip
Primary Area: Social and economic aspects of datasets and benchmarks in machine learning (e.g., fairness, interpretability, human-AI interaction, privacy, safety, strategic behavior)
Submission Number: 804