Efficient Zero-Shot Cross-lingual Inference via Retrieval

Published: 01 Nov 2023 · Last Modified: 16 Apr 2024 · OpenReview Archive Direct Upload · Everyone · CC BY 4.0
Abstract: Resources for building NLP applications, such as data and models, are usually created and curated only for a limited set of high-resource languages. The ability to transfer knowledge to a new language is therefore a key way to make NLP technology accessible to a wider population. This paper presents a framework for zero-shot inference in a target language via cross-lingual retrieval from another language in which limited annotated data for a comparable domain is available. Results on two large-scale multilingual datasets show that, in this setup, the framework improves over fine-tuning multilingual models or translating annotated data, and comes relatively close to fine-tuning the model on the target language directly. These results show that models can be transferred efficiently across languages for a given task and domain, even for languages not covered by multilingual model training approaches.
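The abstract does not spell out the retrieval mechanism, so the following is a minimal sketch of one plausible instantiation, not the paper's exact method: labelled examples in a high-resource source language are embedded with a multilingual sentence encoder, and a target-language input is classified by voting over the labels of its nearest source-language neighbours. The encoder choice (`sentence_transformers` with the `paraphrase-multilingual-MiniLM-L12-v2` model), the `predict` helper, and the example data are all illustrative assumptions.

```python
# Hedged sketch: zero-shot cross-lingual inference via retrieval from
# annotated source-language data, using an assumed multilingual encoder.
import numpy as np
from collections import Counter
from sentence_transformers import SentenceTransformer

# Any multilingual sentence encoder could be substituted here.
encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

# Limited annotated data in a high-resource source language (e.g. English).
source_texts = ["The battery dies within an hour.", "Great camera and screen."]
source_labels = ["negative", "positive"]
source_emb = encoder.encode(source_texts, normalize_embeddings=True)

def predict(target_text: str, k: int = 1) -> str:
    """Classify a target-language input by retrieving the k most similar
    source-language examples and voting over their labels."""
    query = encoder.encode([target_text], normalize_embeddings=True)
    scores = source_emb @ query[0]      # cosine similarity (unit-norm embeddings)
    top_k = np.argsort(-scores)[:k]
    votes = Counter(source_labels[i] for i in top_k)
    return votes.most_common(1)[0][0]

# Zero-shot inference on a target-language (here, Spanish) input.
print(predict("La batería se agota en una hora."))  # expected: "negative"
```

This retrieval-as-classification view is one simple way to reuse source-language annotations without any target-language fine-tuning; the paper's framework may combine retrieval with model inference differently.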