Abstract: This work exploits translation data as a source of semantically relevant learning signal for models of word representation. In particular, we exploit equivalence through translation as a form of distributional context and jointly learn how to embed and align with a deep generative model. Our EMBEDALIGN model embeds words in their complete observed context and learns by marginalising latent lexical alignments. Moreover, it embeds words as posterior probability densities rather than point estimates, which allows us to compare words in context using a measure of overlap between distributions (e.g. KL divergence). We investigate our model's performance on a range of lexical semantics tasks, achieving competitive results on several standard benchmarks, including natural language inference, paraphrasing, and text similarity.
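As an illustration of the last point, the following is a minimal sketch of comparing two contextualised posteriors with a KL divergence, assuming diagonal Gaussian densities (the abstract does not state the parameterisation); the word senses and numbers are purely hypothetical.

import numpy as np

def kl_diag_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    """Closed-form KL(q || p) for two diagonal Gaussians given mean and std-dev vectors."""
    var_q, var_p = sigma_q ** 2, sigma_p ** 2
    return 0.5 * np.sum(np.log(var_p / var_q) + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0)

# Hypothetical posteriors for the same word type observed in two different sentence contexts.
mu_river, sigma_river = np.array([0.9, -0.3]), np.array([0.4, 0.5])
mu_money, sigma_money = np.array([-0.7, 1.1]), np.array([0.3, 0.6])

# Larger values indicate less overlap between the two in-context representations.
print(kl_diag_gaussians(mu_river, sigma_river, mu_money, sigma_money))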