Abstract: Word embeddings are increasingly used in natural language understanding tasks requiring sophisticated semantic information. However, the quality of new embedding methods is usually evaluated based on simple word similarity benchmarks. We propose evaluating word embeddings in vivo by evaluating them on a suite of popular downstream tasks. To ensure the ease of use of the evaluation, we take care to find a good point in the tradeoff space between (1) creating a thorough evaluation, i.e., we evaluate on a diverse set of tasks; and (2) ensuring an easy and fast evaluation, by using simple models with few tuned hyperparameters. This allows us to release this evaluation as a standardized script and online evaluation, available at http://veceval.com/