Abstract: Word representation learning methods have mostly been designed for and evaluated on frequent words. However, in real-world settings, deep neural architectures are often expected to accept a large vocabulary of possible input words. In this paper, we investigate context-based techniques for few-shot learning of representations for infrequent words. We first adapt word2vec to account for expanded contexts and subsequently introduce an additional smoothing procedure. Experiments on similarity benchmarks show significant improvements for rare words.