Reliability and Robustness of Transformers for Automated Short-Answer Grading

Anonymous

17 Aug 2021 (modified: 05 May 2023), ACL ARR 2021 August Blind Submission
Abstract: Short-Answer Grading (SAG) is an application of NLP in education where student answers to open questions are graded. This task places high demands both on the reliability (accuracy and fairness) of label predictions and on model robustness against strategic, "adversarial" input. Neural approaches are powerful tools for many problems in NLP, and transfer learning for Transformer-based models specifically promises to support data-poor tasks such as this. We analyse the performance of a Transformer-based SOTA model, zooming in on class- and item-type-specific behavior in order to gauge reliability; we use adversarial testing to analyze the model's robustness towards strategic answers. We find a strong dependence on the specifics of training and test data, and recommend that model performance be verified for each individual use case.