Towards Generating High-Quality Knowledge Graphs by Leveraging Large Language Models

Published: 01 Jan 2024 · Last Modified: 04 Oct 2025 · NLDB (1) 2024 · CC BY-SA 4.0
Abstract: Knowledge graph creation requires relation extraction (RE) tools, which are often trained on data annotated either manually or by distant supervision. Recent approaches operate at the model level to handle new domains with unseen relations, relying on transfer learning or generative approaches in few-/zero-shot learning scenarios. In this paper, we adopt a different strategy by operating instead at the level of dataset creation. To the best of our knowledge, we are the first to investigate the ability of prompt-based models to build high-quality RE datasets, relying on GPT-4 to extract triples from sentences. Our approach is further enhanced by linking our knowledge graph to Wikidata, a step that enriches our dataset and ensures its interoperability. This strategy has been successfully employed in two use cases: COVID and health relation extraction.
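The abstract does not give the paper's actual prompt or parsing pipeline, so the following is only a minimal illustrative sketch of prompt-based triple extraction: a hypothetical prompt template asking the model to emit `(subject; relation; object)` lines, plus a parser that turns such a response into triples. The template wording, the output format, and the mocked response are all assumptions, not the authors' method.

```python
import re

# Hypothetical prompt template (assumption): the paper's real prompt is not
# shown in the abstract. It asks the model for one triple per line.
PROMPT_TEMPLATE = (
    "Extract all relational triples from the sentence below.\n"
    "Return one triple per line in the form (subject; relation; object).\n"
    "Sentence: {sentence}"
)

def build_prompt(sentence: str) -> str:
    """Fill the extraction prompt with a target sentence."""
    return PROMPT_TEMPLATE.format(sentence=sentence)

def parse_triples(llm_output: str) -> list[tuple[str, str, str]]:
    """Parse '(subject; relation; object)' lines from a model response."""
    triples = []
    for line in llm_output.splitlines():
        m = re.match(r"\s*\(([^;]+);([^;]+);([^)]+)\)\s*$", line)
        if m:
            triples.append(tuple(part.strip() for part in m.groups()))
    return triples

# Mocked response standing in for an actual GPT-4 API call (assumption)
mock_response = "(remdesivir; treats; COVID-19)\n(COVID-19; caused by; SARS-CoV-2)"
print(parse_triples(mock_response))
```

In a real pipeline, `build_prompt(...)` would be sent to the model and the response fed to `parse_triples`; the resulting entities could then be linked to Wikidata identifiers to obtain the interoperability the paper describes.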