Simulating bilingual reading with language models: Effects of word frequency on cognate and interlingual homograph processing
Keywords: bilingualism, language models, surprisal, cognates, interlingual homographs
TL;DR: We train bilingual language models to simulate bilingual processing of cognates and interlingual homographs, focusing on language (non)selective processing. We find inhibition effects for both word types as well as language-specific frequency effects.
Abstract: We train Dutch-English bilingual LMs and their monolingual counterparts to model the processing of cognates and interlingual homographs. We focus on the effect of word frequency: a higher frequency of cognates in bilingual language exposure is one account of the facilitation effect observed in bilingual speakers (relative to non-cognate controls or to monolingual speakers). Interlingual homographs (IHs), by contrast, tend to lead to inhibition. Previous work also supports the language non-selective account, i.e., that both languages are activated in the bilingual mind. The target languages (i.e., the languages of the sentences) are English (for cognate items) and Dutch (for IH items).
We test this by training the LMs described above and estimating processing effort (as surprisal) for cognates and IHs. We find that our bilingual LMs show a language-specific frequency effect: only the target-language word frequency modulates surprisal, not the frequency in the other language. We also find inhibition for both cognates and IHs.
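For readers unfamiliar with surprisal as an estimate of processing effort, the following is a minimal sketch of the standard definition, surprisal(w) = -log2 P(w | context). It is not the authors' implementation: the function names, the choice of base 2 (bits), and the summation over subword tokens are assumptions made for illustration, and the probabilities are toy values rather than model outputs.

```python
import math
from typing import Iterable


def surprisal_bits(prob: float) -> float:
    """Surprisal in bits of an event with conditional probability `prob`."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


def word_surprisal(subword_probs: Iterable[float]) -> float:
    """Surprisal of a word split into subword tokens (hypothetical helper).

    Assumes each probability is conditioned on all preceding tokens, so the
    word's surprisal is the sum of its subword surprisals (chain rule).
    """
    return sum(surprisal_bits(p) for p in subword_probs)


# Toy values only: a word tokenized into two subwords.
toy_probs = [0.5, 0.25]
print(word_surprisal(toy_probs))
```

Under this definition, a less predictable word (lower conditional probability) receives higher surprisal, which is read as greater estimated processing effort.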
Submission Number: 28