Cross-Linguistic Intelligibility of Non-Compositional Expressions in Spoken Context

Published: 04 Sept 2024, Last Modified: 09 Dec 2024 | Interspeech 2024 | Everyone | CC BY 4.0
Abstract: This study investigates how intelligible non-compositional expressions in five closely related Slavic languages (Belarusian, Bulgarian, Czech, Polish, and Ukrainian) are to native Russian speakers when presented in spoken context. We ran a web-based experiment with free-response and multiple-choice translation tasks. Drawing on prior research, we examined two factors: (1) linguistic similarity, measured as orthographic and phonological distance, and (2) surprisal scores obtained from two multilingual speech representation (SR) models fine-tuned for Russian (Wav2Vec2-Large-Ru-Golos-With-LM and Whisper Medium Russian). Pearson correlation and regression analyses suggest that phonological distance is a better predictor of intelligibility scores than SR surprisal.
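As a rough illustration of the quantities named in the abstract, the sketch below computes a length-normalized Levenshtein distance, the surprisal of a probability, and a Pearson correlation coefficient. It rests on assumptions not stated in the abstract: normalized edit distance stands in for the orthographic and phonological distance measures, surprisal is taken as the negative base-2 log of a model-assigned probability, and the function names are hypothetical. It is not the authors' pipeline.

```python
import math


def normalized_levenshtein(a, b):
    """Edit distance between two sequences, divided by the longer length.

    Assumption: a stand-in for the paper's distance measures, which the
    abstract does not define. Works on strings or lists of phone symbols.
    """
    if not a and not b:
        return 0.0
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, sym_a in enumerate(a, start=1):
        curr = [i]
        for j, sym_b in enumerate(b, start=1):
            cost = 0 if sym_a == sym_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / max(len(a), len(b))


def surprisal_bits(prob):
    """Surprisal in bits: -log2(p). Assumes p comes from some model output."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)
```

In such a setup, one would compute a distance (or surprisal) value per stimulus, pair it with that stimulus's intelligibility score, and pass the two lists to `pearson_r`; a strongly negative coefficient would indicate that larger distances go with lower intelligibility.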