Justice in Judgment: Unveiling (Hidden) Bias in LLM-assisted Peer Reviews

ACL ARR 2025 July Submission1090 Authors

29 Jul 2025 (modified: 07 Sept 2025) · ACL ARR 2025 July Submission · Readers: Everyone · License: CC BY 4.0
Abstract: The adoption of large language models (LLMs) is transforming the peer review process, from assisting reviewers in writing more detailed evaluations to generating entire reviews automatically. While these capabilities offer exciting opportunities, they also raise critical concerns about fairness and reliability. In this paper, we investigate bias in LLM-generated peer reviews by conducting controlled experiments on sensitive metadata, including author affiliation and gender. Our analysis consistently shows affiliation bias favoring institutions that rank highly in common academic rankings. Additionally, we find some gender preferences that, although subtle in magnitude, have the potential to compound over time. Notably, we uncover implicit biases that become more evident with token-based soft ratings.
Paper Type: Short
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: Fairness
Contribution Types: Model analysis & interpretability
Languages Studied: English
Reassignment Request Area Chair: This is not a resubmission
Reassignment Request Reviewers: This is not a resubmission
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: Yes
A2 Elaboration: Limitations and Ethics sections.
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: References section
B2 Discuss The License For Artifacts: Yes
B2 Elaboration: Appendix C and J
B3 Artifact Use Consistent With Intended Use: Yes
B3 Elaboration: Appendix J
B4 Data Contains Personally Identifying Info Or Offensive Content: Yes
B4 Elaboration: Appendices D, E
B5 Documentation Of Artifacts: Yes
B5 Elaboration: Section 2.3, Appendix B, D, E
B6 Statistics For Data: Yes
B6 Elaboration: Section 2.3
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: Appendix C.
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: Section 2 and Appendix B, D, E
C3 Descriptive Statistics: Yes
C3 Elaboration: Section 3
C4 Parameters For Packages: Yes
C4 Elaboration: Section 3 and Appendix C.
D Human Subjects Including Annotators: No
D1 Instructions Given To Participants: N/A
D2 Recruitment And Payment: N/A
D3 Data Consent: N/A
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: N/A
E AI Assistants In Research Or Writing: No
E1 Information About Use Of AI Assistants: N/A
Author Submission Checklist: yes
Submission Number: 1090