Making Instruction Finetuning Accessible to Non-English Languages: A Case Study on Swedish Models

Published: 20 Mar 2023, Last Modified: 18 May 2023 · NoDaLiDa 2023
Keywords: Instruction finetuning, Cross-lingual, Multilingual, Instructions, GPT, GPT-SW3, GPT-3, OPT, zero-shot
TL;DR: We instruction finetune Swedish models on a translated dataset of generated instructions, significantly improving the models' zero-shot performance.
Abstract: In recent years, instruction-finetuned models have received increased attention due to their remarkable zero-shot and generalization capabilities. However, the widespread adoption of these models has been limited to the English language, largely due to the costs and challenges associated with creating instruction datasets. To overcome this, automatic instruction generation has been proposed as a resourceful alternative. We see this as an opportunity for the adoption of instruction finetuning in other languages. In this paper, we explore the viability of instruction finetuning for Swedish. We translate a dataset of generated instructions from English to Swedish and use it to finetune both Swedish and non-Swedish models. Results indicate that the use of translated instructions significantly improves the models' zero-shot performance, even on unseen data, while staying competitive with strong baselines ten times their size. We see this paper as a first step and a proof of concept that instruction finetuning for Swedish is within reach through resourceful means, and that several directions for further improvement exist.
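
The pipeline described in the abstract (machine-translating generated English instructions into Swedish, then finetuning a causal language model on the result) could be sketched roughly as below. This is a minimal illustration using Hugging Face transformers and datasets; the translation model (Helsinki-NLP/opus-mt-en-sv), the base checkpoint (a small GPT-SW3 model), the prompt template, and the hyperparameters are all assumptions for illustration, not the paper's exact setup.

```python
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
    pipeline,
)

# 1) Translate generated English instructions to Swedish.
#    opus-mt-en-sv is an illustrative choice, not necessarily the paper's MT system.
translate = pipeline("translation", model="Helsinki-NLP/opus-mt-en-sv")

english_examples = [
    {"instruction": "Summarize the following text.", "output": "..."},
    # ... the rest of the generated instruction dataset
]

def to_swedish(example):
    return {
        "instruction": translate(example["instruction"])[0]["translation_text"],
        "output": translate(example["output"])[0]["translation_text"],
    }

swedish_data = Dataset.from_list(english_examples).map(to_swedish)

# 2) Load a base model to instruction-finetune (a small GPT-SW3 checkpoint, as an example).
base_model = "AI-Sweden-Models/gpt-sw3-126m"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# 3) Format each instruction-response pair as a single training sequence.
#    This Swedish prompt template is an assumption, not the paper's template.
def tokenize(example):
    text = (
        f"Instruktion: {example['instruction']}\n"
        f"Svar: {example['output']}{tokenizer.eos_token}"
    )
    return tokenizer(text, truncation=True, max_length=512)

train_ds = swedish_data.map(tokenize, remove_columns=swedish_data.column_names)

# 4) Standard causal-LM finetuning on the translated instructions.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt-sw3-instruct", num_train_epochs=3),
    train_dataset=train_ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The same recipe applies to the non-Swedish baselines mentioned in the keywords (e.g. OPT): only the base checkpoint changes, while the translated instruction data and the finetuning loop stay the same.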
Student Paper: Yes, the first author is a student