Abstract: This paper presents a comparative study of two approaches to automated assessment of Czech spoken language exams for non-native speakers: one using large language models applied to transcripts, and the other based on pre-trained speech encoder models. To our knowledge, this is the first study to explore automatic speaking assessment (ASA) for the Czech language. We evaluate both methods on a dataset of authentic high-stakes oral exams, annotated with binary pass/fail labels and total exam scores. Our best-performing models reach a QWK score of 0.65. Our experiments demonstrate the feasibility of applying ASA techniques in Czech and illustrate challenges related to data scarcity, transcription quality, and performance variability between input types.
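The abstract reports agreement with human graders as a QWK (quadratic weighted kappa) score. As background, a minimal pure-Python sketch of the standard QWK computation is shown below; the rating values are invented for illustration and are not from the paper's dataset.

```python
def quadratic_weighted_kappa(a, b, n_classes):
    """Standard QWK: 1 - (weighted observed disagreement / weighted expected disagreement)."""
    # Observed confusion matrix between the two raters
    O = [[0.0] * n_classes for _ in range(n_classes)]
    for x, y in zip(a, b):
        O[x][y] += 1
    n = len(a)
    # Marginal histograms; expected matrix assumes raters are independent
    hist_a = [sum(O[i]) for i in range(n_classes)]
    hist_b = [sum(O[i][j] for i in range(n_classes)) for j in range(n_classes)]
    num = den = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            w = (i - j) ** 2 / (n_classes - 1) ** 2  # quadratic penalty
            num += w * O[i][j]
            den += w * hist_a[i] * hist_b[j] / n
    return 1.0 - num / den

# Hypothetical human vs. model scores on a 0-3 ordinal scale
human = [0, 1, 2, 2, 3]
model = [0, 2, 2, 3, 3]
print(round(quadratic_weighted_kappa(human, model, 4), 3))  # → 0.833
```

Perfect agreement yields 1.0; chance-level agreement yields 0.0, so the paper's reported 0.65 indicates substantial but imperfect alignment with human graders.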
External IDs: dblp:conf/tsd/PolakNRRB25