Abstract: Increased connectivity has led to a sharp rise in the creation and availability of structured and unstructured text content, with millions of new documents generated every minute. Key-phrase extraction is the process of finding the most important words and phrases that best capture the overall meaning and topics of a text document. Common techniques follow supervised or unsupervised methods for extractive or abstractive key-phrase extraction, but they struggle to perform well and to generalize across datasets. In this paper, we follow a supervised, extractive approach and model key-phrase extraction as a sequence labeling task. We leverage the strength of transformers on sequential tasks and explore the effect of initializing the model's embedding layer with pre-trained weights. We evaluate our model on several standard key-phrase extraction datasets, and our results significantly outperform all baselines as well as state-of-the-art scores on every dataset.
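The abstract does not specify the labeling scheme, but casting key-phrase extraction as sequence labeling is conventionally done with BIO tags, where each token is marked as Beginning, Inside, or Outside a key-phrase span, and a transformer tagger predicts one tag per token. The sketch below (a stdlib-only illustration, not the authors' implementation) shows how gold key-phrases would be projected onto token-level BIO labels to build training data:

```python
def bio_tags(tokens, keyphrases):
    """Label each token B (begin), I (inside), or O (outside) a key-phrase.

    Illustrative sketch: matches each gold key-phrase against the token
    sequence case-insensitively and marks the matched span with B/I tags.
    """
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    for phrase in keyphrases:
        words = phrase.lower().split()
        n = len(words)
        for i in range(len(tokens) - n + 1):
            if lowered[i:i + n] == words:
                tags[i] = "B"                     # first token of the phrase
                for j in range(i + 1, i + n):
                    tags[j] = "I"                 # continuation tokens
    return tags

tokens = "We study transformer models for key-phrase extraction".split()
print(bio_tags(tokens, ["transformer models", "key-phrase extraction"]))
# ['O', 'O', 'B', 'I', 'O', 'B', 'I']
```

A transformer encoder with a per-token classification head would then be trained on such (token, tag) pairs; contiguous B/I runs in the predicted tags are decoded back into key-phrases at inference time.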