Abstract: This position paper presents a novel perspective on the use of Large Language Models (LLMs) in the artificial intelligence paper review process. We first critique the current tendency to use LLMs primarily for generating review text, arguing that this approach overlooks more meaningful applications that keep human expertise at the core of evaluation. Instead, we advocate leveraging LLMs to support key aspects of the review process: verifying the reproducibility of experimental results, checking the correctness and relevance of citations, and assisting with ethics review flagging. For example, integrating LLM-agent tools that generate code from research papers has recently enabled automated assessment of a paper's reproducibility, improving the transparency and reliability of research. By reorienting LLM usage toward these targeted, assistive roles, we outline a pathway for more effective and responsible integration of LLMs into peer review, ultimately supporting both reviewer efficiency and the integrity of the scientific process.