Abstract: The growing use of large language models (LLMs) in academic peer review poses significant challenges, particularly in distinguishing AI-generated content from human-written feedback. This research addresses the problem of identifying AI-generated peer review comments, which are crucial to maintaining the integrity of scholarly evaluation. Prior research has primarily focused on generic AI-generated text detection or on estimating the fraction of peer reviews that may be AI-generated, often treating reviews as monolithic units. However, these methods fail to detect finer-grained AI-generated points within mixed-authorship reviews. To address this gap, we propose MixRevDetect, a novel method to identify AI-generated points in peer reviews. Our approach achieved an F1 score of 88.86%, significantly outperforming existing AI text detection methods.
Loading