Waste Not: Meta-Embedding of Word and Context Vectors

Selin Degirmenci, Aydin Gerek, Murat Can Ganiz

Published: 2019, Last Modified: 30 Jun 2023, Venue: NLDB 2019, Readers: Everyone

Abstract: The word2vec and fastText models train two vectors per word: a word vector and a context vector. The context vectors are typically discarded after training, even though they may contain information useful for different NLP tasks. We therefore combine word and context vectors in the framework of meta-embeddings. Our experiments show performance increases on several NLP tasks, such as text classification, semantic similarity, and analogy. This approach can thus be used to increase performance on downstream tasks while requiring minimal additional computational resources.
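
As a rough illustration of combining word and context vectors, the sketch below builds two simple meta-embeddings from a pair of matrices. The abstract does not say which combination methods the paper uses; concatenation and averaging are assumed here because they are common meta-embedding baselines. The function names (`l2_normalize`, `meta_embed`) are hypothetical, and the random matrices are stand-ins for the two weight matrices a trained word2vec or fastText model would provide.

```python
import numpy as np


def l2_normalize(matrix, eps=1e-12):
    """Scale every row to unit L2 norm so neither source dominates."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix / np.maximum(norms, eps)


def meta_embed(word_vecs, context_vecs, method="concat"):
    """Combine word and context vectors row by row (hypothetical helper).

    Both inputs have shape (vocab_size, dim); row i of each matrix is
    assumed to belong to the same vocabulary item.
    """
    if word_vecs.shape != context_vecs.shape:
        raise ValueError("word and context matrices must have the same shape")
    w = l2_normalize(word_vecs)
    c = l2_normalize(context_vecs)
    if method == "concat":
        # Result has shape (vocab_size, 2 * dim).
        return np.hstack([w, c])
    if method == "average":
        # Result keeps shape (vocab_size, dim).
        return (w + c) / 2.0
    raise ValueError(f"unknown method: {method}")


# Stand-in data: random matrices instead of trained embeddings (assumption).
rng = np.random.default_rng(0)
vocab_size, dim = 5, 4
word_vecs = rng.normal(size=(vocab_size, dim))
context_vecs = rng.normal(size=(vocab_size, dim))

concat = meta_embed(word_vecs, context_vecs, method="concat")
avg = meta_embed(word_vecs, context_vecs, method="average")
print(concat.shape, avg.shape)
```

Both variants reuse vectors that training already produced, so the only extra cost is a normalization and one array operation per vocabulary, which is consistent with the abstract's claim of minimal additional computation.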
