Multi-Stage Framework with Refinement based Point Set Registration for Unsupervised Bi-Lingual Word Alignment

Anonymous

16 Nov 2021 (modified: 05 May 2023)
ACL ARR 2021 November Blind Submission
Readers: Everyone
Abstract: Cross-lingual alignment of word embeddings plays an important role in knowledge transfer across languages, improving machine translation and other multi-lingual applications. Current unsupervised approaches rely on learning structure-preserving linear transformations using adversarial networks and refinement strategies. However, such techniques tend to suffer from instability and convergence issues, requiring tedious fine-tuning of parameter settings. This paper proposes BioSpere, a novel multi-stage framework for the unsupervised mapping of bi-lingual word embeddings onto a shared vector space, which combines adversarial initialization, a refinement procedure, and a point set registration algorithm. We show that our framework alleviates the above shortcomings and is robust to variable adversarial learning performance and parameter choices. Experiments on parallel dictionary induction, sentence translation, and word similarity demonstrate state-of-the-art results for BioSpere on diverse language pairs.