LLMRank: Enhancing Large Language Models for Unsupervised Keyphrase Extraction with a Candidate Graph Approach

ACL ARR 2024 June Submission2924 Authors

15 Jun 2024 (modified: 02 Jul 2024), ACL ARR 2024 June Submission, CC BY 4.0

Abstract: Keyphrase extraction is a crucial NLP task that extracts essential information from extensive texts, aiding in content summarization and browsing. This paper introduces LLMRank, a novel unsupervised keyphrase extraction method that enhances Large Language Models (LLMs) with a graph-based approach. Initially, LLMs generate a wide array of candidate keyphrases, which are then represented as nodes in a custom graph. Edges between these nodes are established based on the co-occurrence of candidates within the content, enhancing keyphrase ranking through structured contextual information. We evaluated LLMRank using three state-of-the-art LLMs across four publicly available datasets, comparing its performance against seventeen baseline models. The results demonstrate that LLMRank effectively extracts keyphrases from long and complex documents in an unsupervised manner. The source code is available on GitHub.
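The graph step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the co-occurrence window or the scoring rule, so this example assumes sentence-level co-occurrence and a PageRank-style power iteration; both are stand-ins, not necessarily what LLMRank uses. The function names `build_cooccurrence_graph` and `rank_candidates` are hypothetical, and the candidate list is assumed to have been produced by an LLM beforehand.

```python
import re
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(text, candidates):
    """Build a weighted undirected graph over candidate keyphrases.

    Assumption: two candidates co-occur when both appear (as substrings)
    in the same sentence; the edge weight counts such sentences.
    """
    sentences = [s.lower() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    graph = defaultdict(lambda: defaultdict(float))
    for sentence in sentences:
        present = sorted({c for c in candidates if c.lower() in sentence})
        for a, b in combinations(present, 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def rank_candidates(candidates, graph, damping=0.85, iterations=50):
    """Rank candidates with a PageRank-style power iteration.

    Assumption: this scoring rule is illustrative only; the abstract
    states that the graph improves ranking but not how scores are computed.
    """
    nodes = list(dict.fromkeys(candidates))  # de-duplicate, keep order
    n = len(nodes)
    if n == 0:
        return []
    scores = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        updated = {}
        for node in nodes:
            incoming = 0.0
            for neighbor, weight in graph.get(node, {}).items():
                total = sum(graph[neighbor].values())
                incoming += scores[neighbor] * weight / total
            updated[node] = (1 - damping) / n + damping * incoming
        scores = updated
    return sorted(nodes, key=lambda c: scores[c], reverse=True)


if __name__ == "__main__":
    doc = (
        "Keyphrase extraction uses a candidate graph. "
        "The candidate graph links keyphrase extraction to ranking. "
        "Ranking is unsupervised."
    )
    # Hypothetical LLM output, including one candidate absent from the text.
    llm_candidates = ["keyphrase extraction", "candidate graph", "ranking", "weather"]
    g = build_cooccurrence_graph(doc, llm_candidates)
    print(rank_candidates(llm_candidates, g))
```

In this sketch, a candidate that never co-occurs with any other (here "weather") receives only the baseline score and sinks to the bottom of the ranking, which is one way structured context could filter out spurious LLM candidates.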

Paper Type: Long

Research Area: Information Extraction

Research Area Keywords: open information extraction; document-level extraction; zero/few-shot extraction

Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models, Data analysis

Languages Studied: English

Submission Number: 2924
