Multi-Stage Framework with Refinement based Point Set Registration for Unsupervised Bi-Lingual Word Alignment
Abstract: Cross-lingual alignment of word embeddings plays an important role in knowledge transfer across languages, improving machine translation and other multi-lingual applications. Current unsupervised approaches rely on learning structure-preserving linear transformations using adversarial networks and refinement strategies. However, such techniques tend to suffer from instability and convergence issues, requiring tedious fine-tuning of parameter settings. This paper proposes BioSpere, a novel multi-stage framework for unsupervised mapping of bi-lingual word embeddings onto a shared vector space, combining adversarial initialization, a refinement procedure, and a point set registration algorithm. We show that our framework alleviates the above shortcomings and is robust to variable adversarial learning performance and parameter choices. Experiments on parallel dictionary induction, sentence translation, and word similarity demonstrate state-of-the-art results for BioSpere on diverse language pairs.
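For readers unfamiliar with the "structure-preserving linear transformations" mentioned in the abstract, the block below states the orthogonal Procrustes problem, a formulation commonly used for refinement in this line of work. It is an illustrative sketch only: the abstract does not say that BioSpere uses exactly this objective, and the symbols $X$, $Y$, $W$, $d$, and $n$ are assumptions introduced here, not notation from the paper.

```latex
% Assumed notation (not from the paper):
%   X, Y \in \mathbb{R}^{d \times n}: source and target embeddings of n
%   tentatively paired words, one word per column.
%   O_d: the set of d x d orthogonal matrices.
\[
  W^{\star}
  = \operatorname*{arg\,min}_{W \in O_d} \; \lVert W X - Y \rVert_F
  = U V^{\top},
  \qquad \text{where } U \Sigma V^{\top} = \operatorname{SVD}\!\left( Y X^{\top} \right).
\]
% Because W is orthogonal, it preserves inner products and distances:
\[
  \langle W x_i, W x_j \rangle = \langle x_i, x_j \rangle
  \quad \text{for all columns } x_i, x_j \text{ of } X .
\]
```

The orthogonality constraint is what makes the map structure preserving: the mapped source embeddings keep their pairwise similarities, so only their orientation relative to the target space changes.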