Abstract: Short-Answer Grading (SAG) is an application of NLP in education where student answers to open questions are graded. This task places high demands both on the reliability (accuracy and fairness) of label predictions and on model robustness against strategic, "adversarial" input. Neural approaches are powerful tools for many problems in NLP, and transfer learning for Transformer-based models specifically promises to support data-poor tasks such as this. We analyse the performance of a Transformer-based SOTA model, zooming in on class- and item-type-specific behavior in order to gauge reliability; we use adversarial testing to analyze the model's robustness towards strategic answers. We find a strong dependence on the specifics of training and test data, and recommend that model performance be verified for each individual use case.