Abstract: The development of state-of-the-art generative large language models (LLMs) disproportionately relies on English-centric tokenizers, vocabularies, and pre-training data. Although some LLMs have multilingual capabilities, recent studies have shown that their inference efficiency deteriorates when generating text in languages other than English, resulting in increased inference time and costs. Cross-lingual vocabulary adaptation methods have been proposed to adapt models to a target language with the aim of improving downstream performance. However, their effectiveness in improving the inference efficiency of generative LLMs has yet to be explored. In this paper, we perform an empirical study of various cross-lingual vocabulary adaptation methods on five generative LLMs (including monolingual and multilingual models) across four typologically diverse languages and four natural language understanding tasks. We find that cross-lingual vocabulary adaptation substantially improves LLM inference speed, with speedups of up to 271.5%. We also show that adapting LLMs that have been pre-trained on more balanced multilingual data results in downstream performance comparable to that of the original models.
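For context, inference cost in generative LLMs scales with the number of tokens that must be decoded, so an English-centric tokenizer that over-segments non-English text directly slows generation. Below is a minimal sketch (not from the paper; the model name and example sentences are illustrative assumptions) of measuring per-language token fertility with the Hugging Face transformers library:

```python
# Sketch: compare tokenizer "fertility" (tokens per character) across
# languages. Higher fertility means more decoding steps for the same
# content, i.e., slower and costlier inference. The model and sentences
# below are illustrative, not the paper's actual setup.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # English-centric BPE

samples = {
    "en": "The cat sat on the mat.",
    "de": "Die Katze sass auf der Matte.",
    "ja": "猫がマットの上に座っていた。",
    "ar": "جلس القط على الحصيرة.",
    "sw": "Paka alikaa juu ya mkeka.",
}

for lang, text in samples.items():
    ids = tokenizer(text)["input_ids"]
    print(f"{lang}: {len(ids)} tokens for {len(text)} chars "
          f"({len(ids) / len(text):.2f} tokens/char)")
```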
Paper Type: long
Research Area: Multilinguality and Language Diversity
Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Approaches to low compute settings - efficiency
Languages Studied: German, Japanese, Arabic, Swahili
Preprint Status: We plan to release a non-anonymous preprint in the next two months (i.e., during the reviewing process).
A1: yes
A1 Elaboration For Yes Or No: Limitations
A2: no
A2 Elaboration For Yes Or No: We used publicly available models and datasets that are widely used in the research community.
A3: yes
A3 Elaboration For Yes Or No: Abstract and Section 1
B: yes
B1: yes
B1 Elaboration For Yes Or No: Section 4 and Appendix A
B2: yes
B2 Elaboration For Yes Or No: Appendix B
B3: yes
B3 Elaboration For Yes Or No: Appendix B
B4: no
B4 Elaboration For Yes Or No: We used publicly available models and datasets that are widely used in the research community.
B5: n/a
B5 Elaboration For Yes Or No: We used publicly available models and datasets that are widely used in the research community and did not create such artifacts.
B6: yes
B6 Elaboration For Yes Or No: Section 4 and Appendix A
C: yes
C1: yes
C1 Elaboration For Yes Or No: Section 4 and Appendix A
C2: yes
C2 Elaboration For Yes Or No: Section 4
C3: yes
C3 Elaboration For Yes Or No: Appendix C
C4: yes
C4 Elaboration For Yes Or No: Appendix A
D: no
D1: n/a
D2: n/a
D3: n/a
D4: n/a
D5: n/a
E: no
E1: n/a