Isomorphic Cross-lingual Embeddings for Low-Resource Languages


16 Nov 2021 (modified: 05 May 2023) | ACL ARR 2021 November Blind Submission | Readers: Everyone
Abstract: Recent research in cross-lingual representation learning has focused on offline mapping approaches due to their simplicity, computational efficiency, and ability to work with minimal parallel resources. However, these approaches crucially depend on the assumption that the embedding spaces are approximately isomorphic, which does not hold in practice and leads to poorer performance on low-resource and distant language pairs. In this paper, we introduce a framework to learn cross-lingual word embeddings, without assuming isometry, for low-resource pairs via joint exploitation of a related higher-resource language. Both the source and target monolingual embeddings are independently aligned to the related language, enabling the use of offline methods. We show that this approach outperforms other methods on several low-resource language pairs in both bilingual lexicon induction and eigenvalue similarity.
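To make the idea of aligning two spaces through a related language concrete, the following is a minimal sketch, not the authors' method. It assumes orthogonal Procrustes as the offline mapping (the abstract does not name a specific mapping), and it uses synthetic toy vectors in which the source and target spaces are exact rotations of the pivot space, which real embeddings are not. The names `procrustes`, `pivot`, `src`, and `tgt` are hypothetical.

```python
import numpy as np

def procrustes(X, Y):
    """Return the orthogonal matrix W that minimises ||X @ W - Y||_F."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

rng = np.random.default_rng(0)
d, n = 8, 50  # toy embedding dimension and seed-dictionary size

# Toy stand-in for the related (pivot) language: n dictionary entries.
pivot = rng.normal(size=(n, d))

# Assumption: source and target are rotated copies of the pivot space.
R_src = np.linalg.qr(rng.normal(size=(d, d)))[0]
R_tgt = np.linalg.qr(rng.normal(size=(d, d)))[0]
src = pivot @ R_src
tgt = pivot @ R_tgt

# Align each space to the pivot independently of the other.
W_src = procrustes(src, pivot)
W_tgt = procrustes(tgt, pivot)

# Both now live in the pivot space and can be compared directly.
src_in_pivot = src @ W_src
tgt_in_pivot = tgt @ W_tgt
```

In this toy setting the two mapped spaces coincide because the data were built to be isometric. With real embeddings the fit would only be approximate, which is the gap the abstract says the framework targets.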
