Abstract: This study focuses on evaluating and predicting the intelligibility of non-compositional expressions within the context
of five closely related Slavic languages: Belarusian, Bulgarian, Czech, Polish, and Ukrainian, as perceived by native
speakers of Russian. Our investigation employs a web-based experiment where native Russian respondents take
part in free-response and multiple-choice translation tasks. Building on previous studies of mutual intelligibility
and non-compositionality, we propose two predictive factors for reading comprehension of unknown but closely
related languages: 1) linguistic distances, both orthographic and phonological; 2) surprisal scores
obtained from monolingual Language Models (LMs). Our primary objective is to explore how these two factors
relate to the intelligibility scores and response times recorded in our web-based experiment. Our findings reveal that,
while intelligibility scores from the experimental tasks exhibit a stronger correlation with phonological distances, LM
surprisal scores appear to be better predictors of the time participants invest in completing the translation tasks.
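
As an illustration of the two kinds of predictors named above, the following is a minimal sketch and not the study's actual implementation. It assumes that orthographic distance is approximated by length-normalized Levenshtein distance and that surprisal is the negative base-2 logarithm of a probability assigned by a language model. The function names are hypothetical, and the probability is passed in directly rather than obtained from any particular LM.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Levenshtein distance divided by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def surprisal_bits(probability: float) -> float:
    """Surprisal in bits of an item to which a model assigns this probability."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)
```

For example, Russian "хлеб" and Ukrainian "хліб" differ by one substitution in four letters, so normalized_distance gives 0.25, and an item with model probability 0.25 has a surprisal of 2 bits.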