OpenReviewer: Predicting Conference Decisions with LLMs and Beyond

ICLR 2026 Conference Submission 18772 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Large Language Models; Acceptance Prediction; Multimodal
Abstract: The rapid growth of AI conference submissions has strained the peer-review system, motivating interest in AI-assisted review. Yet it remains unclear how reliably such systems approximate human judgment, which relies on domain expertise and nuanced reasoning. To address this challenge, we introduce OpenReviewer, a model designed to directly predict conference acceptance decisions rather than generate full reviews. Using ICLR 2024–2025 data, we evaluate large language models (LLMs), vision–language models (VLMs), and interpretable statistical models. Results show that text-only LLMs with continual pre-training outperform multimodal counterparts, achieving up to 78.5% accuracy on balanced datasets (vs. a 50% random baseline). White-box statistical models further provide interpretability through feature analysis, revealing that structural attributes (e.g., paper length, section balance, citation engagement) are consistently predictive. Beyond average accuracy, a confidence-stratified utility analysis shows that the top 10% most confident predictions reach 92.92% precision, enabling reliable triage of "obvious" accepts and rejects while exposing areas of uncertainty. Overall, our findings demonstrate both the promise and limitations of AI-involved peer review: current models can reduce reviewer workload and assist in screening submissions, but fall short of reliably replacing expert judgment.
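The confidence-stratified analysis mentioned in the abstract can be illustrated with a minimal sketch: rank predictions by how far the model's probability sits from the decision boundary, then score only the top fraction. This is an illustrative reconstruction, not the authors' code; the function name, the use of |p − 0.5| as the confidence measure, and the toy data are all assumptions for demonstration.

```python
import numpy as np

def top_confidence_precision(probs, labels, top_frac=0.10):
    """Score only the most confident fraction of binary accept/reject predictions.

    probs  : model-estimated P(accept) for each paper, shape (n,)
    labels : ground-truth decisions, 1 = accept, 0 = reject
    Returns the fraction of correct predictions among the `top_frac` most
    confident papers (confidence measured as distance from the 0.5 boundary).
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    confidence = np.abs(probs - 0.5)            # high near 0 or 1, low near 0.5
    k = max(1, int(round(len(probs) * top_frac)))
    top = np.argsort(confidence)[-k:]           # indices of the k most confident
    preds = (probs[top] >= 0.5).astype(int)
    return float(np.mean(preds == labels[top]))

# Toy usage: a roughly calibrated model is more often right when it is confident,
# so precision on the top-10% bucket exceeds overall accuracy.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=1000)
probs = np.clip(0.2 + 0.6 * labels + rng.normal(0.0, 0.25, size=1000), 0.0, 1.0)
print(top_confidence_precision(probs, labels, top_frac=0.10))
```

Under this reading, the paper's 92.92% figure would be the score of such a top-10% bucket on the balanced ICLR test set; the exact definition used by the authors may differ.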
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 18772