Abstract: The number of word embedding models grows every year. Most of them learn word embeddings based on the co-occurrence information of words and their contexts. However, it remains an open question what the best definition of context is. We provide the first systematic investigation of different context types and representations for learning word embeddings. We conduct comprehensive experiments to evaluate their effectiveness on 4 tasks (21 datasets), which yield insights into context selection. We hope that this paper, along with the published code, can serve as a guideline for choosing contexts for our community.
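To make the notion of "context" concrete, below is a minimal sketch (not the paper's published code) of how (word, context) pairs are typically extracted under a linear window context, the most common context type; the window size and helper name are illustrative assumptions.

# Illustrative sketch: (target, context) pair extraction with a linear window.
# Window size and function name are assumptions, not the paper's implementation.
def linear_context_pairs(tokens, window=2):
    """Yield (target, context) pairs using a symmetric linear window."""
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield target, tokens[j]

sentence = "australian scientist discovers star with telescope".split()
print(list(linear_context_pairs(sentence, window=2)))

Alternative context types (e.g., dependency-based contexts) would replace the window loop with pairs drawn from a parse, which is the kind of design choice the paper compares.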
TL;DR: This paper investigates different context types and representations for learning word embeddings.
Conflicts: a.a, b.b
Keywords: Unsupervised Learning, Natural language processing
17 Replies