MULTPAX: Keyphrase Extraction Using Language Models and Knowledge Graphs

02 Feb 2023, OpenReview Archive Direct Upload, Readers: Everyone
Abstract: Keyphrase extraction aims to identify a small set of phrases that best describe the content of a text. The automatic generation of keyphrases has become essential for many natural language applications such as text categorization, indexing, and summarization. In this paper, we propose MULTPAX, a multitask framework for extracting present and absent keyphrases using pre-trained language models and knowledge graphs. Our framework contains three components. First, MULTPAX identifies present keyphrases in an input document. Then, MULTPAX links to external knowledge graphs to retrieve additional relevant phrases. Finally, MULTPAX ranks the extracted phrases by their semantic relatedness to the input document and returns the top-k phrases as the final output. We conducted several experiments on four benchmark datasets to evaluate the performance of MULTPAX against different state-of-the-art baselines. The evaluation results demonstrate that our approach significantly outperforms the state-of-the-art baselines, with a t-test significance of p < 0.041. Our source code and datasets are publicly available at https://github.com/dice-group/MultPAX.
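
The final ranking step described in the abstract can be illustrated with a minimal sketch. This is not the MULTPAX implementation: the abstract does not say how semantic relatedness is computed, so the sketch swaps in plain bag-of-words cosine similarity as a stand-in for embeddings from a pre-trained language model. The names `bag_of_words`, `cosine`, and `rank_top_k` are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_top_k(document: str, candidates: list[str], k: int) -> list[str]:
    """Score each candidate phrase against the document and keep the k best.

    Assumption: token overlap stands in for semantic relatedness here;
    a real system would compare language-model embeddings instead.
    """
    doc_vec = bag_of_words(document)
    scored = [(cosine(doc_vec, bag_of_words(c)), c) for c in candidates]
    # Highest score first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [phrase for _, phrase in scored[:k]]


document = "Keyphrase extraction with language models and knowledge graphs"
candidates = [
    "knowledge graphs",
    "language models",
    "image segmentation",
    "keyphrase extraction",
]
top_phrases = rank_top_k(document, candidates, k=3)
```

In this toy setup the candidate list plays the role of the merged phrases from the first two components, and the unrelated phrase scores zero and is cut by the top-k filter. Replacing `bag_of_words` with an embedding function would leave the ranking logic unchanged.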