Abstract: This paper presents a comparative study of two approaches to automated assessment of Czech spoken language exams for non-native speakers: one using large language models applied to transcripts, and the other based on pre-trained speech encoder models. To our knowledge, this is the first study to explore automatic speaking assessment (ASA) for the Czech language. We evaluate both methods on a dataset of authentic high-stakes oral exams, annotated with binary pass/fail labels and total exam scores. Our best-performing models reach a QWK score of 0.65. Our experiments demonstrate the feasibility of applying ASA techniques in Czech and illustrate challenges related to data scarcity, transcription quality, and performance variability between input types.
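The abstract reports agreement with human graders as a QWK (quadratic weighted kappa) score. As background, a minimal pure-Python sketch of the standard QWK computation is shown below; the rating values are invented for illustration and are not from the paper's dataset.

```python
def quadratic_weighted_kappa(a, b, n_classes):
    """Standard QWK: 1 - (weighted observed disagreement / weighted expected disagreement)."""
    # Observed confusion matrix between the two raters
    O = [[0.0] * n_classes for _ in range(n_classes)]
    for x, y in zip(a, b):
        O[x][y] += 1
    n = len(a)
    # Marginal histograms; expected matrix assumes raters are independent
    hist_a = [sum(O[i]) for i in range(n_classes)]
    hist_b = [sum(O[i][j] for i in range(n_classes)) for j in range(n_classes)]
    num = den = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            w = (i - j) ** 2 / (n_classes - 1) ** 2  # quadratic penalty
            num += w * O[i][j]
            den += w * hist_a[i] * hist_b[j] / n
    return 1.0 - num / den

# Hypothetical human vs. model scores on a 0-3 ordinal scale
human = [0, 1, 2, 2, 3]
model = [0, 2, 2, 3, 3]
print(round(quadratic_weighted_kappa(human, model, 4), 3))  # → 0.833
```

Perfect agreement yields 1.0; chance-level agreement yields 0.0, so the paper's reported 0.65 indicates substantial but imperfect alignment with human graders.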
External IDs: dblp:conf/tsd/PolakNRRB25