CNIMA: A Universal Evaluation Framework and Automated Approach for Assessing Second Language Dialogues

ACL ARR 2024 August Submission 68 Authors

12 Aug 2024 (modified: 20 Sept 2024) · ACL ARR 2024 August Submission · CC BY 4.0
Abstract: We develop CNIMA (**C**hinese **N**on-Native **I**nteractivity **M**easurement and **A**utomation), a labelled Chinese-as-a-second-language dataset of 10K dialogues. We annotate CNIMA with an evaluation framework, originally introduced for English-as-a-second-language dialogues, that assesses micro-level features (e.g. backchannels) and macro-level interactivity labels (e.g. topic management), and we test the framework's transferability from English to Chinese. We find the framework robust across the two languages, and it reveals both universal and language-specific relationships between micro-level and macro-level features. We then propose an approach to automate the evaluation and observe strong performance, yielding a new tool for automated second-language assessment. Because the system is built on large language models, it requires no large-scale annotated training data and can be easily adapted to other languages.
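To make the automated pipeline described above concrete, here is a minimal sketch (not the authors' released code) of the two-step idea: an LLM first labels micro-level features of a learner's dialogue turn, and those judgements then inform a macro-level interactivity label. The prompts, the model name (`gpt-4o`), and all micro-level features other than backchannels (which the abstract names) are illustrative assumptions.

```python
# Hypothetical sketch of LLM-based micro-to-macro dialogue assessment.
# Assumptions: OpenAI chat API, OPENAI_API_KEY in the environment, and an
# invented feature set; only "backchannels" and "topic management" come
# from the abstract.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

MICRO_PROMPT = (
    "You are assessing a second-language learner's dialogue turn.\n"
    "For the turn below, answer yes/no for each micro-level feature:\n"
    "backchannel, code-switching, self-repair.\n"
    "Turn: {turn}\n"
    "Answer as: backchannel=<yes/no>; code-switching=<yes/no>; "
    "self-repair=<yes/no>"
)

MACRO_PROMPT = (
    "Given these micro-level feature judgements for a learner dialogue:\n"
    "{micro}\n"
    "Rate the dialogue's topic management on a 1-5 scale. "
    "Reply with a single digit."
)

def ask(prompt: str) -> str:
    """Single LLM call; temperature 0 for more reproducible labels."""
    resp = client.chat.completions.create(
        model="gpt-4o",
        temperature=0,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content.strip()

def evaluate(turn: str) -> tuple[str, str]:
    """Step 1: predict micro-level features; step 2: derive a macro label."""
    micro = ask(MICRO_PROMPT.format(turn=turn))
    macro = ask(MACRO_PROMPT.format(micro=micro))
    return micro, macro

if __name__ == "__main__":
    micro, macro = evaluate("嗯，对对对，我也觉得这个电影很有意思。")
    print("micro-level features:", micro)
    print("macro-level topic management:", macro)
```

Because the pipeline relies only on prompting rather than supervised training, porting it to another language in this sketch amounts to swapping the example turn and, if desired, translating the prompts, which mirrors the adaptability claim in the abstract.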
Paper Type: Long
Research Area: Dialogue and Interactive Systems
Research Area Keywords: multilingual / low resource, evaluation and metrics, spoken dialogue systems, conversation, applications
Contribution Types: Model analysis & interpretability, Reproduction study, Approaches to low-resource settings, Approaches to low-compute settings (efficiency), Data resources, Data analysis
Languages Studied: Chinese (as a second language), English (as a second language)
Submission Number: 68