Abstract: As instruction-tuned large language models (LLMs) gain global adoption, their ability to follow instructions in multiple languages becomes increasingly crucial. In this work, we investigate how multilinguality during instruction tuning of a multilingual LLM affects instruction-following across languages from the pre-training corpus. We first show that many languages transfer some instruction-following capabilities to other languages even from monolingual tuning. Furthermore, we find that only 40 multilingual examples integrated into an English tuning set substantially improve multilingual instruction-following, in both languages seen and unseen during tuning. In general, we observe that models tuned on multilingual mixtures exhibit comparable or superior performance in multiple languages compared to monolingually tuned models, despite training on 10x fewer examples in those languages. Finally, we find that diversifying the instruction tuning set with even just 2-4 languages significantly improves cross-lingual generalization. Our results suggest that massively multilingual instruction-tuned models can be built with only a very small set of multilingual instruction-response pairs.
Paper Type: long
Research Area: Multilinguality and Language Diversity
Contribution Types: NLP engineering experiment
Languages Studied: Arabic, Chinese, Czech, English, Estonian, Finnish, Hebrew, Hindi, Italian, Russian, Spanish, Swahili
Preprint Status: There is a non-anonymous preprint (URL specified in the next question).
A1: yes
A1 Elaboration For Yes Or No: Section 7
A2: no
A2 Elaboration For Yes Or No: We do not identify risks that our work adds to the existing risks of LLMs.
A3: yes
A3 Elaboration For Yes Or No: Section 1
B: yes
B1: yes
B1 Elaboration For Yes Or No: Section 2,3,4
B2: no
B2 Elaboration For Yes Or No: We use datasets that are publicly available for scientific purposes, and do not release artifacts.
B3: no
B3 Elaboration For Yes Or No: We use the publicly available academic datasets consistent with their intended use.
B4: no
B4 Elaboration For Yes Or No: We use publicly available academic datasets that are widely used by the research community.
B5: no
B5 Elaboration For Yes Or No: We use publicly available academic datasets that are widely used by the research community.
B6: yes
C: yes
C1: no
C1 Elaboration For Yes Or No: We are not allowed to disclose the exact number of parameters the model has.
C2: yes
C3: yes
C4: n/a
D: yes
D1: yes
D2: no
D2 Elaboration For Yes Or No: The annotators are colleagues (engineers) who volunteered to help with our study.
D3: no
D3 Elaboration For Yes Or No: The annotators are colleagues (engineers) who volunteered to help with our study.
D4: no
D4 Elaboration For Yes Or No: The annotators are colleagues (engineers) who volunteered to help with our study, and the data collected consisted only of their preferences among different model responses to prompts commonly used by the research community for model evaluation in this way.
D5: no
D5 Elaboration For Yes Or No: The annotators are colleagues (engineers) who volunteered to help with our study.
E: yes
E1: no
E1 Elaboration For Yes Or No: We used AI assistants for technical help with our plots.