EMERGE: A Benchmark for Updating Knowledge Graphs with Emerging Textual Knowledge

ICLR 2026 Conference Submission 18427 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: knowledge base construction, named entity recognition and relation extraction, entity linking/disambiguation
TL;DR: A new dataset aligning KG updates with emerging textual knowledge, introducing operations for updating KGs from evolving textual sources.
Abstract: Knowledge Graphs (KGs) are structured knowledge repositories containing entities and relations between them. In this paper, we investigate the problem of automatically updating KGs over time with respect to the evolution of knowledge in unstructured textual sources. This problem requires identifying a wide range of update operations based on the state of an existing KG at a specific point in time. This contrasts with traditional information extraction pipelines, which extract knowledge from text independently of the current state of a KG. To address this challenge, we propose a method for constructing a dataset consisting of Wikidata KG snapshots over time and Wikipedia passages paired with the corresponding edit operations that they induce in a particular KG snapshot. The resulting dataset comprises 233K Wikipedia passages aligned with a total of 1.45 million KG edits over 7 yearly snapshots of Wikidata from 2019 to 2025. Our experimental results highlight challenges in updating KG snapshots based on emerging textual knowledge, particularly the integration of knowledge between text and KGs, positioning the dataset as a valuable benchmark for future research. We will publicly release our dataset and model implementations.
Supplementary Material: zip
Primary Area: datasets and benchmarks
Submission Number: 18427