Wiki Entity Summarization Benchmark

27 Sept 2024 (modified: 05 Feb 2025) | Submitted to ICLR 2025 | CC BY 4.0
Keywords: Entity summarization, knowledge graph benchmark
TL;DR: We propose a novel entity summarization benchmark with dataset generation, automatically generated ground-truth, and access to the graph structure.
Abstract: Entity summarization aims to compute concise summaries for entities in knowledge graphs. However, current datasets and benchmarks are often limited to only a few hundred entities and overlook knowledge graph structure. This is particularly evident in the scarcity of ground-truth summaries, with few labeled entities available for evaluation and training. We propose WIKES (Wiki Entity Summarization Benchmark), a large benchmark comprising entities, their summaries, and their connections. Additionally, WIKES features a dataset generator to test entity summarization algorithms on different subgraphs of the knowledge graph. Importantly, our approach combines graph algorithms and NLP models and draws on multiple data sources, so that WIKES requires no human annotation, making it cost-effective and generalizable to multiple domains. Finally, WIKES is scalable and capable of capturing the complexities of knowledge graphs in terms of both topology and semantics. WIKES also includes existing datasets for comparison. Empirical studies of entity summarization methods confirm the usefulness of our benchmark. Data, code, and models are available at: https://anonymous.4open.science/r/Wikes-2DDA/README.md
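
To make the benchmark setting concrete, below is a minimal sketch (not the authors' code, and not tied to the WIKES API) of how a predicted entity summary could be scored against a ground-truth summary of the kind the abstract describes, assuming summaries are sets of (subject, predicate, object) triples and using the standard F1 overlap measure common in entity summarization evaluation. All identifiers and triples in the example are hypothetical.

# Minimal sketch: F1 overlap between a predicted top-k summary and a
# ground-truth summary, each represented as a set of (s, p, o) triples.
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

def summary_f1(predicted: Set[Triple], gold: Set[Triple]) -> float:
    """F1 of the triple overlap between a predicted and a gold summary."""
    if not predicted or not gold:
        return 0.0
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)

# Toy example for a single entity (identifiers are illustrative only).
gold = {("Q42", "occupation", "writer"), ("Q42", "notableWork", "H2G2")}
pred = {("Q42", "occupation", "writer"), ("Q42", "birthPlace", "Cambridge")}
print(f"F1 = {summary_f1(pred, gold):.2f}")  # F1 = 0.50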
Supplementary Material: pdf
Primary Area: datasets and benchmarks
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9772