Keywords: Entity summarization, knowledge graph benchmark
TL;DR: We propose a novel entity summarization benchmark with dataset generation, automatically generated ground truth, and access to the graph structure.
Abstract: Entity summarization aims to compute concise summaries for entities in knowledge graphs.
However, current datasets and benchmarks are often limited to only a few hundred entities
and overlook knowledge graph structure. This is particularly evident in the scarcity of
ground-truth summaries, with few labeled entities available for evaluation and training. We
propose WIKES (Wiki Entity Summarization Benchmark), a large benchmark comprising
entities, their summaries, and their connections. Additionally, WIKES features a
dataset generator to test entity summarization algorithms in different subgraphs of the
knowledge graph. Importantly, our approach combines graph algorithms and NLP models,
as well as different data sources, so that WIKES does not require human annotation,
which makes the approach cost-effective and generalizable to multiple domains. Finally,
WIKES is scalable and capable of capturing the complexities of knowledge graphs in
terms of topology and semantics. WIKES also includes existing datasets for comparison.
Empirical studies of entity summarization methods confirm the usefulness of our benchmark.
Data, code, and models are available at: https://anonymous.4open.science/r/Wikes-2DDA/README.md
Supplementary Material: pdf
Primary Area: datasets and benchmarks
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9772