Wiki Entity Summarization Benchmark

Atefeh Moradan; Saeedeh Javadi; Mohammad Sorkhpar; Klim Zaporojets; Davide Mottin; Ira Assent

27 Sept 2024 (modified: 28 Apr 2025); ICLR 2025 Conference Withdrawn Submission; CC BY 4.0

Keywords: Entity summarization, knowledge graph benchmark

TL;DR: We propose a novel entity summarization benchmark with dataset generation, automatically generated ground-truth, and access to the graph structure.

Abstract: Entity summarization aims to compute concise summaries for entities in knowledge graphs. However, current datasets and benchmarks are often limited to only a few hundred entities and overlook knowledge graph structure. This is particularly evident in the scarcity of ground-truth summaries, with few labeled entities available for evaluation and training. We propose WIKES (Wiki Entity Summarization Benchmark), a large benchmark comprising entities, their summaries, and their connections. Additionally, WIKES features a dataset generator to test entity summarization algorithms in different subgraphs of the knowledge graph. Importantly, our approach combines graph algorithms, NLP models, and different data sources so that WIKES does not require human annotation, which makes the approach cost-effective and generalizable to multiple domains. Finally, WIKES is scalable and capable of capturing the complexities of knowledge graphs in terms of topology and semantics. WIKES also includes existing datasets for comparison. Empirical studies of entity summarization methods confirm the usefulness of our benchmark. Data, code, and models are available at: https://anonymous.4open.science/r/Wikes-2DDA/README.md

Supplementary Material: pdf

Primary Area: datasets and benchmarks

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 9772
