Abstract: Limited availability of multilingual text corpora for training language models often leads to poor performance on downstream tasks due to undertrained representation spaces for languages other than English. This 'under-representation' has motivated recent cross-lingual transfer methods to leverage the English representation space by, e.g., mixing English and non-English tokens at the input or extending model parameters, which in turn increases computational complexity. To address this, we introduce Fusion for Language Representations (FLARE) in adapters, a method designed to improve both representation quality and downstream performance for languages other than English. FLARE integrates source and target language representations within the bottlenecks of low-rank (LoRA) adapters using lightweight linear transformations. This preserves parameter efficiency, since the method requires no additional parameters, while improving transfer performance and further narrowing the performance gap to English.
Another key advantage of the proposed latent representation fusion is that it does not increase the number of input tokens, thus maintaining computational efficiency. Moreover, FLARE offers the flexibility to integrate various types of representations; for example, we show that latent translations extracted from machine translation models can be fused.
Our results demonstrate FLARE's effectiveness on natural language understanding tasks, consistently reducing the performance gap to English.
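To make the fusion idea concrete, below is a minimal PyTorch sketch of fusing source- and target-language hidden states inside a LoRA bottleneck. It assumes element-wise addition as the fusion function; the module and parameter names (FusionLoRALayer, down, up) are illustrative and not taken from the paper or its released code.

```python
# Minimal sketch (assumption: addition as the parameter-free fusion
# function inside the LoRA bottleneck; names are illustrative).
import torch
import torch.nn as nn


class FusionLoRALayer(nn.Module):
    """LoRA adapter that fuses source- and target-language hidden
    states inside its low-rank bottleneck."""

    def __init__(self, d_model: int, rank: int, alpha: float = 16.0):
        super().__init__()
        self.down = nn.Linear(d_model, rank, bias=False)   # A: d -> r
        self.up = nn.Linear(rank, d_model, bias=False)     # B: r -> d
        self.scaling = alpha / rank
        nn.init.zeros_(self.up.weight)                      # standard LoRA init

    def forward(self, h_target: torch.Tensor, h_source: torch.Tensor) -> torch.Tensor:
        # Project both language representations into the r-dim bottleneck.
        z_tgt = self.down(h_target)
        z_src = self.down(h_source)
        # Parameter-free fusion (here: addition), so no weights beyond
        # the standard LoRA matrices are introduced.
        z = z_tgt + z_src
        # Up-project and add as the usual LoRA residual update.
        return h_target + self.scaling * self.up(z)


# Toy usage: batch=2, seq=8, d_model=16, rank=4.
if __name__ == "__main__":
    layer = FusionLoRALayer(d_model=16, rank=4)
    h_tgt = torch.randn(2, 8, 16)   # target-language hidden states
    h_src = torch.randn(2, 8, 16)   # e.g. English or latent-translation states
    print(layer(h_tgt, h_src).shape)  # torch.Size([2, 8, 16])
```

Because the source representations pass through the same down-projection as the target ones, the fusion adds no new trainable parameters and no extra input tokens, matching the efficiency claims above.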
Paper Type: Long
Research Area: Multilingualism and Cross-Lingual NLP
Research Area Keywords: cross-lingual transfer, multilingual representations, less-resourced languages
Contribution Types: NLP engineering experiment
Languages Studied: Acehnese, Arabic, Balinese, Banjarese, Bengali, Buginese, Bulgarian, Chinese, Finnish, French, German, Greek, Hindi, Indonesian, Javanese, Korean, Madurese, Minangkabau, Ngaju, Russian, Spanish, Swahili, Telugu, Thai, Turkish, Urdu, Vietnamese
Submission Number: 1738