Improvements to an Automated Content Scoring System for Spoken CALL Responses: the ETS Submission to the Second Spoken CALL Shared Task

Keelan Evanini, Matthew Mulholland, Rutuja Ubale, Yao Qian, Robert A. Pugh, Vikram Ramanarayanan, Aoife Cahill

INTERSPEECH 2018 (modified: 08 Nov 2021)

Abstract: This paper describes the ETS submission to the 2018 Spoken CALL Shared Task. We employed a system that uses word and character n-gram features in a random forest machine learning framework, based on the system that achieved the second-highest score in the text processing track of the 2017 Spoken CALL Shared Task. This system was augmented with additional features based on comparing the learner's responses to language models trained on text written by both native English speakers and L1-German English learners. In addition, we developed a set of sequence-to-label models using bidirectional LSTM-RNNs with an attention layer. The RNN model predictions were combined with the other feature sets using feature-level and score-level fusion approaches, resulting in a best-performing system that achieved a D score of 7.397 on the test set (ranking 5th out of 12 submissions to the text processing track of the Shared Task). Subsequent experiments resulted in higher D scores when the model parameters were optimized for D score instead of F-score. The paper also presents an error analysis of these models in an attempt to determine which metric is more appropriate for evaluating spoken CALL systems.
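As an illustration of the word and character n-gram features mentioned in the abstract, the following is a minimal sketch of how such counts could be extracted from a learner response. It is not the authors' implementation: the function names (`word_ngrams`, `char_ngrams`, `ngram_features`), the whitespace tokenization, the lowercasing, and the n-gram orders are all assumptions made for this example. The resulting sparse count dictionary is the kind of input that could be vectorized and passed to a random forest classifier.

```python
from collections import Counter


def word_ngrams(text, n):
    """Count word n-grams in a lowercased, whitespace-tokenized response."""
    tokens = text.lower().split()
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def char_ngrams(text, n):
    """Count character n-grams (spaces included) in a lowercased response."""
    s = text.lower()
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def ngram_features(text, word_orders=(1, 2), char_orders=(2, 3)):
    """Merge word and character n-gram counts into one feature dictionary.

    Keys are prefixed with the n-gram type and order (hypothetical naming
    scheme, e.g. "w2:i want" or "c3:tic") so the two families cannot collide.
    """
    feats = Counter()
    for n in word_orders:
        for gram, count in word_ngrams(text, n).items():
            feats[f"w{n}:{gram}"] = count
    for n in char_orders:
        for gram, count in char_ngrams(text, n).items():
            feats[f"c{n}:{gram}"] = count
    return feats


if __name__ == "__main__":
    # Example response; the sentence is invented for illustration.
    features = ngram_features("I want a ticket")
    for name in sorted(features):
        print(name, features[name])
```

The type-and-order prefix on each key keeps a word unigram such as "a" distinct from a character n-gram with the same surface form, which matters once both families share one feature space.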
