The Obscure Limitation of Modular Multilingual Language Models

Muhammad Farid Adilazuarda; Samuel Cahyawijaya; Ayu Purwarianti

01 Mar 2023 (modified: 04 Aug 2025) | Submitted to Tiny Papers @ ICLR 2023 | Readers: Everyone

Keywords: multilinguality, modular language models, language identification

TL;DR: We show the limitation of modular multilingual language models on inputs whose language is unknown, measure the effect of adding language identification modules, and discuss ways to close the performance gap caused by pipelining LID and MLMs.

Abstract: We expose the limitation of modular multilingual language models (MLMs) in multilingual inference scenarios where the input language is unknown. Existing evaluations of modular MLMs leave out language identification (LID) modules, which obscures how modular MLMs perform in real-world multilingual scenarios. In this work, we showcase the effect of adding LID to the multilingual evaluation of modular MLMs and discuss ways to close the performance gap caused by the pipelined approach of LID and modular MLMs.
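The pipelined setting described in the abstract can be illustrated with a minimal toy sketch. It is not the authors' implementation: the keyword-based modules, the `toy_lid` function, and the five examples are all hypothetical and chosen only to show how an LID mistake routes an input to the wrong language module, so that accuracy with predicted languages falls below accuracy with gold (oracle) languages.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-ins for language-specific modules of a modular MLM.
# Each "module" is a trivial keyword classifier, used only for illustration.
def en_module(text: str) -> str:
    return "pos" if "good" in text.split() else "neg"

def id_module(text: str) -> str:
    return "pos" if "bagus" in text.split() else "neg"

MODULES: Dict[str, Callable[[str], str]] = {"en": en_module, "id": id_module}

# Hypothetical, deliberately imperfect LID: it only knows two Indonesian cue words.
def toy_lid(text: str) -> str:
    return "id" if {"ini", "itu"} & set(text.split()) else "en"

# (text, gold language, gold label) triples; toy data, not from the paper.
EXAMPLES: List[Tuple[str, str, str]] = [
    ("this is good", "en", "pos"),
    ("this is bad", "en", "neg"),
    ("ini bagus", "id", "pos"),
    ("ini buruk", "id", "neg"),
    ("sangat bagus", "id", "pos"),  # toy_lid misses this one and predicts "en"
]

def evaluate(lid: Optional[Callable[[str], str]] = None) -> float:
    """Accuracy with oracle routing (lid=None) or LID-based routing."""
    correct = 0
    for text, gold_lang, gold_label in EXAMPLES:
        lang = gold_lang if lid is None else lid(text)
        prediction = MODULES[lang](text)
        correct += int(prediction == gold_label)
    return correct / len(EXAMPLES)

oracle_acc = evaluate()            # language given, as in evaluations without LID
pipeline_acc = evaluate(toy_lid)   # language predicted by the LID module first

print(f"oracle routing: {oracle_acc:.2f}, LID pipeline: {pipeline_acc:.2f}")
```

In this toy setup the only difference between the two runs is the routing decision, so the gap between `oracle_acc` and `pipeline_acc` is entirely due to LID errors propagating through the pipeline.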

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/the-obscure-limitation-of-modular/code)
