A New Learning-to-Rank Framework for Keyphrase Extraction Using Multi-scale Ratings and Feature Fusion

Corina Florescu, Avijeet Shil, Wei Jin

Published: 01 Jan 2024, Last Modified: 30 Jul 2025. Venue: APWeb/WAIM (5) 2024. License: CC BY-SA 4.0

Abstract: Previous work has mainly framed keyphrase extraction (KE) as a binary classification task in which candidate phrases are predicted to be either keyphrases or non-keyphrases. In reality, however, the boundary between the two is hard to draw with a binary judgment, even for human annotators. A finer-grained measure of appropriateness is therefore desirable for this task, which motivates our idea of incorporating the degree to which a phrase represents the main topics of a document into the learning and ranking process. In this paper, we propose ppKE, a first supervised ranking model for keyphrase extraction that incorporates phrase importance information. We also conduct a comprehensive feature study and evaluation. Our model obtains remarkable improvements in performance over ranking models that do not take phrase relevance into account, as well as over strong previous approaches for this task.
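
The contrast between binary and multi-scale labels can be illustrated with a minimal sketch. It is not the ppKE model; it only assumes a generic pairwise learning-to-rank setup in which each pair of candidate phrases with different ratings yields one training preference. The function name `preference_pairs`, the 0 to 3 rating scale, and the example phrases and ratings are all hypothetical.

```python
from itertools import combinations


def preference_pairs(ratings):
    """Return (preferred, other) pairs for all phrases with different ratings.

    `ratings` maps a candidate phrase to a numeric relevance rating.
    Phrases with equal ratings produce no pair, since neither is preferred.
    """
    pairs = []
    for (p1, r1), (p2, r2) in combinations(ratings.items(), 2):
        if r1 > r2:
            pairs.append((p1, p2))
        elif r2 > r1:
            pairs.append((p2, p1))
    return pairs


# Hypothetical multi-scale ratings (assumed scale: 0 = irrelevant, 3 = central).
graded = {
    "keyphrase extraction": 3,
    "learning to rank": 2,
    "feature fusion": 1,
    "previous work": 0,
}

# Collapsing the same ratings to a binary keyphrase / non-keyphrase label.
binary = {phrase: int(rating > 0) for phrase, rating in graded.items()}

graded_pairs = preference_pairs(graded)
binary_pairs = preference_pairs(binary)

print(len(graded_pairs), "preference pairs from graded ratings")
print(len(binary_pairs), "preference pairs from binary labels")
```

Under these assumptions, the graded ratings order phrases that the binary labels treat as equally good, so a pairwise ranker receives preferences among the keyphrases themselves and not only between keyphrases and non-keyphrases.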