Abstract: Automated grading of subjective questions remains a significant challenge in educational assessment. Traditional manual grading is inefficient and inconsistent, while existing AI-based methods lack flexibility and robustness in handling diverse answers. This paper introduces MASGrader, an innovative multi-agent framework for grading subjective answers, consisting of four agents: the Overview Agent, which performs macroscopic evaluation; the Detail Review Agent, which conducts microscopic reviews; the Logical Validation Agent, which checks semantic and logical consistency; and the Supervisory Agent, which coordinates debates and makes final decisions. MASGrader enhances grading accuracy, stability, and transparency by simulating human-like debate and reflection mechanisms. Experiments on a dataset of 500 subjective answers demonstrate that MASGrader improves weighted Kappa scores and accuracy by 5-10\% compared to a single-agent baseline while generating detailed scoring rationales that increase interpretability. By introducing dynamic collaboration, logical validation, and iterative self-improvement, this multi-agent framework provides a reliable solution for high-stakes educational assessment.
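The four-agent pipeline described above can be sketched as follows. This is a minimal illustrative sketch of the orchestration pattern only, not the authors' implementation: all class, function, and field names (e.g. `Verdict`, `supervisory_agent`, the 0-10 score scale) are assumptions, and the supervisory step is simplified to averaging rather than the paper's iterative debate.

```python
# Hypothetical sketch of the MASGrader pipeline described in the abstract.
# All names and the scoring scale are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class Verdict:
    score: float    # proposed score on an assumed 0-10 scale
    rationale: str  # scoring rationale, for interpretability


def overview_agent(answer: str) -> Verdict:
    # Macroscopic evaluation: overall relevance and coverage.
    return Verdict(score=8.0, rationale="covers the main points")


def detail_review_agent(answer: str) -> Verdict:
    # Microscopic review: factual details and terminology.
    return Verdict(score=7.0, rationale="minor factual slips")


def logical_validation_agent(answer: str) -> Verdict:
    # Semantic and logical consistency check.
    return Verdict(score=7.5, rationale="argument is coherent")


def supervisory_agent(verdicts: list[Verdict]) -> Verdict:
    # Coordinates the individual verdicts into a final decision.
    # Simplified here to an average plus a merged rationale; the
    # paper's version instead moderates a debate among the agents.
    score = sum(v.score for v in verdicts) / len(verdicts)
    rationale = "; ".join(v.rationale for v in verdicts)
    return Verdict(score=round(score, 2), rationale=rationale)


def grade(answer: str) -> Verdict:
    verdicts = [
        overview_agent(answer),
        detail_review_agent(answer),
        logical_validation_agent(answer),
    ]
    return supervisory_agent(verdicts)
```

In this sketch the three evaluator agents run independently and the supervisory agent produces both a final score and a combined rationale, mirroring the interpretability claim in the abstract.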
Paper Type: Short
Research Area: NLP Applications
Research Area Keywords: Automated Grading, Multi-Agent Systems, Subjective Question Answering, Interpretability
Contribution Types: Model analysis & interpretability, Publicly available software and/or pre-trained models
Languages Studied: English, Chinese
Submission Number: 3426