Towards Robust Cross-Prompt Essay Trait Scoring: A Generative Model Framework with Ranking Loss

ACL ARR 2024 June Submission5530 Authors

16 Jun 2024 (modified: 08 Aug 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: Automated Essay Scoring (AES) aims to evaluate the overall quality of essays, while essay trait scoring provides a more detailed assessment by assigning separate scores to specific traits. Prompt-specific AES models have shown success, but applying them to "unseen" prompts remains challenging because limited prompt and essay diversity hinders generalization. This paper introduces GenAES, a generative model framework for cross-prompt essay trait scoring that leverages large language models (LLMs) to augment prompts and essays. GenAES further develops a prompt encoder to handle representations of unseen prompts and introduces a ranking loss that evaluates the similarity of unlabeled generated essays to the source essays. Experimental results show that GenAES improves generalization, achieving state-of-the-art performance on the ASAP++ dataset, with 6.5% and 7.3% gains in average QWK scores over prompts and traits, respectively. The generated prompts and essays are released to facilitate future research.
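The abstract does not specify the exact form of the ranking loss, so the sketch below is only a minimal, hypothetical illustration of one plausible pairwise formulation: generated essays judged more similar to the source essays are encouraged to receive higher scores. All names (pairwise_ranking_loss, the score tensors, the margin value) are assumptions for illustration, not the paper's actual implementation.

```python
# Hypothetical sketch of a pairwise margin ranking loss for unlabeled
# generated essays, ordered by their similarity to the source essays.
# Names and the specific formulation are assumptions, not from the paper.
import torch
import torch.nn.functional as F

def pairwise_ranking_loss(scores_more_similar: torch.Tensor,
                          scores_less_similar: torch.Tensor,
                          margin: float = 0.1) -> torch.Tensor:
    """Encourage essays judged more similar to the source essays to be
    scored higher than less similar ones by at least `margin`."""
    # target = 1 means the first argument should be ranked above the second.
    target = torch.ones_like(scores_more_similar)
    return F.margin_ranking_loss(scores_more_similar, scores_less_similar,
                                 target, margin=margin)

# Toy usage: predicted scores for pairs of generated essays, where the first
# tensor holds the essay of each pair that is more similar to its source essay.
more_similar = torch.tensor([0.80, 0.60, 0.70])
less_similar = torch.tensor([0.50, 0.65, 0.40])
print(pairwise_ranking_loss(more_similar, less_similar).item())
```

A pairwise margin loss like this only needs relative judgments (which generated essay is closer to the source), which is why it can exploit unlabeled generated essays without gold trait scores; whether GenAES uses this exact form is not stated in the abstract.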
Paper Type: Short
Research Area: Human-Centered NLP
Research Area Keywords: Language Learning, Psycholinguistics
Languages Studied: English
Submission Number: 5530