Abstract: Keyphrase extraction aims to automatically select a small set of phrases from a document that best describe its main ideas. There is a great need for better keyphrase extraction methods that work in the absence of labeled data, as current unsupervised algorithms fall well short of the performance of their supervised counterparts. In this paper we propose a widely applicable distant supervision framework based on auxiliary data from query logs. By propagating information from queries and the subsequent consumption of content, we produce weak labels, transforming the problem into an easier supervised task. Evaluation on a large dataset shows the superiority of this approach over unsupervised alternatives.
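To make the weak-labeling idea concrete, the following is a minimal sketch of one way query logs could be turned into labels. It is not the paper's actual procedure: the exact-match rule, the `min_clicks` threshold, the n-gram candidate generator, and the names `candidate_phrases` and `weak_labels` are all assumptions made for illustration. The log is assumed to be a list of (query, consumed document id) pairs.

```python
from collections import Counter, defaultdict


def candidate_phrases(text, max_len=3):
    """Return the set of lowercased word n-grams (1..max_len) in the text."""
    words = text.lower().split()
    return {
        " ".join(words[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(words) - n + 1)
    }


def weak_labels(query_log, documents, min_clicks=2):
    """Assign a weak 0/1 label to every candidate phrase of every document.

    Assumed rule (hypothetical): a phrase is labeled 1 when it exactly
    matches at least `min_clicks` logged queries that led to the document.
    """
    # Count, per document, how often each normalized query led to it.
    clicks = defaultdict(Counter)
    for query, doc_id in query_log:
        clicks[doc_id][query.lower().strip()] += 1

    labels = {}
    for doc_id, text in documents.items():
        labels[doc_id] = {
            phrase: int(clicks[doc_id][phrase] >= min_clicks)
            for phrase in candidate_phrases(text)
        }
    return labels


# Toy data, invented for this example only.
documents = {"d1": "Distant supervision for keyphrase extraction"}
query_log = [
    ("keyphrase extraction", "d1"),
    ("Keyphrase Extraction", "d1"),
    ("distant supervision", "d1"),
    ("weather", "d1"),
]
labels = weak_labels(query_log, documents)
```

The resulting phrase-level labels could then serve as training targets for any standard supervised classifier over candidate phrases. Queries that do not occur in the document (here, "weather") produce no label at all, since only phrases drawn from the document are candidates.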