Abstract: Cross-lingual word embeddings are becoming
increasingly important in multilingual NLP.
Recently, it has been shown that these embeddings
can be effectively learned by aligning
two disjoint monolingual vector spaces
through linear transformations, using no more
than a small bilingual dictionary as supervision.
In this work, we propose to apply an additional
transformation after the initial alignment
step, which moves cross-lingual synonyms
towards a middle point between them.
Our aim in applying this transformation is to
achieve a tighter cross-lingual integration of
the vector spaces. In addition, and perhaps
surprisingly, the monolingual spaces are also
improved by this transformation. This is in contrast
to the original alignment, which is typically
learned such that the structure of the
monolingual spaces is preserved. Our experiments
confirm that the resulting cross-lingual
embeddings outperform state-of-the-art models
in both monolingual and cross-lingual evaluation
tasks.
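
For concreteness, here is a minimal numpy sketch of the two-step procedure described above: an orthogonal (Procrustes) alignment learned from a small bilingual dictionary, followed by an unconstrained linear map that pulls each aligned word and its translation towards their midpoint. The matrix shapes, random placeholder data, and function names are illustrative assumptions, not the authors' code.

```python
import numpy as np

def procrustes_align(X, Z):
    """Orthogonal W minimising ||X W - Z||_F (Schoenemann's closed-form solution)."""
    U, _, Vt = np.linalg.svd(X.T @ Z)
    return U @ Vt

def midpoint_map(A, M):
    """Unconstrained least-squares map sending the rows of A towards the rows of M."""
    return np.linalg.lstsq(A, M, rcond=None)[0]

# Hypothetical stand-ins: rows of X and Z are the embeddings of the
# source- and target-language words of each dictionary pair.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 300))  # source-side dictionary vectors (placeholder)
Z = rng.normal(size=(5000, 300))  # target-side dictionary vectors (placeholder)

# Step 1: initial alignment of the source space onto the target space,
# supervised only by the dictionary pairs.
W = procrustes_align(X, Z)

# Step 2: the additional transformation, moving cross-lingual synonyms
# towards the middle point between them.
M = (X @ W + Z) / 2.0
W_src = midpoint_map(X @ W, M)  # apply to every source vector after alignment
W_tgt = midpoint_map(Z, M)      # apply to every target vector
```

Note that, unlike the orthogonal alignment in step 1, the midpoint maps in step 2 are not constrained to be orthogonal, so they can deform the monolingual spaces rather than merely rotate them.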