Learning Bilingual Lexicon for Low-Resource Language Pairs

2017 (modified: 15 Nov 2021) | NLPCC 2017 | Readers: Everyone
Abstract: Learning a bilingual lexicon from monolingual data is a novel idea in natural language processing that can benefit many low-resource language pairs. In this paper, we present an approach for inducing a bilingual lexicon from monolingual data. Our method requires only a small seed bilingual lexicon and uses Canonical Correlation Analysis (CCA) to construct a shared latent space that links the two monolingual embedding spaces. Experimental results show that a bilingual lexicon of considerable precision and size can be learned from Chinese-Uyghur and Chinese-Kazakh monolingual data.
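
The abstract does not spell out the pipeline, so the following is only a minimal sketch of the general idea it describes: fit CCA on the embeddings of the seed lexicon pairs, project both monolingual vocabularies into the shared space, and read off translation candidates. The function names `fit_cca` and `induce_lexicon`, the ridge term `reg`, and the cosine nearest-neighbor retrieval step are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def fit_cca(X, Y, k, reg=1e-6):
    """Fit CCA on row-aligned seed pairs.

    X: (n, dx) source-language embeddings of the seed words.
    Y: (n, dy) target-language embeddings of their translations.
    k: number of shared dimensions, k <= min(dx, dy).
    reg: small ridge term (an assumption) to keep covariances invertible.
    """
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    # Whiten each side with a Cholesky factor: Cxx = Lx Lx^T, Cyy = Ly Ly^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    # T = Lx^{-1} Cxy Ly^{-T}; its singular values are the canonical correlations.
    T = np.linalg.solve(Lx, Cxy)
    T = np.linalg.solve(Ly, T.T).T
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    # Map the singular vectors back to the original embedding coordinates.
    A = np.linalg.solve(Lx.T, U[:, :k])
    B = np.linalg.solve(Ly.T, Vt.T[:, :k])
    return {"mx": mx, "my": my, "A": A, "B": B, "corr": s[:k]}

def induce_lexicon(src_words, src_vecs, tgt_words, tgt_vecs, model):
    """Pair each source word with its cosine-nearest target word in the shared space."""
    P = (src_vecs - model["mx"]) @ model["A"]
    Q = (tgt_vecs - model["my"]) @ model["B"]
    P = P / (np.linalg.norm(P, axis=1, keepdims=True) + 1e-12)
    Q = Q / (np.linalg.norm(Q, axis=1, keepdims=True) + 1e-12)
    sims = P @ Q.T                      # cosine similarity matrix
    best = sims.argmax(axis=1)          # best target index per source word
    return {w: (tgt_words[j], float(sims[i, j]))
            for i, (w, j) in enumerate(zip(src_words, best))}
```

In use, `fit_cca` would be called on the embedding rows of the seed lexicon only, and `induce_lexicon` on the full vocabularies of both languages; the returned similarity score could then be thresholded to trade lexicon size against precision. Plain nearest-neighbor retrieval is the simplest choice here and may differ from whatever selection strategy the paper actually uses.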