A Fair and In-Depth Evaluation of Existing End-to-End Entity Linking Systems

Published: 07 Oct 2023, Last Modified: 01 Dec 2023, EMNLP 2023 Main
Submission Type: Regular Long Paper
Submission Track: Resources and Evaluation
Submission Track 2: Information Extraction
Keywords: entity linking, entity linking evaluation, entity linking benchmarks
TL;DR: We provide a fair and in-depth evaluation of a variety of existing end-to-end entity linkers and create two new benchmarks to address problems in existing entity linking benchmarks and to provide a basis for fairer comparison.
Abstract: Existing evaluations of entity linking systems often say little about how a system will perform for a particular application. There are two fundamental reasons for this. One is that many evaluations use only aggregate measures (like precision, recall, and F1 score), without a detailed error analysis or a closer look at the results. The other is that all of the widely used benchmarks have strong biases and artifacts, in particular: a strong focus on named entities, an unclear or missing specification of what else counts as an entity mention, poor handling of ambiguities, and an over- or underrepresentation of certain kinds of entities. We provide a more meaningful and fair in-depth evaluation of a variety of existing end-to-end entity linkers. We characterize their strengths and weaknesses and also report on reproducibility aspects. The detailed results of our evaluation can be inspected at https://elevant.cs.uni-freiburg.de/emnlp2023. Our evaluation is based on several widely used benchmarks, which exhibit the problems mentioned above to various degrees, as well as on two new benchmarks that address these problems. The new benchmarks are available at https://github.com/ad-freiburg/fair-entity-linking-benchmarks.
Submission Number: 3100
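
The abstract's critique of aggregate measures is easier to follow with the computation in front of you. The following minimal Python sketch is not taken from the paper or from the ELEVANT tool; the tuple format and the Wikidata IDs are illustrative assumptions. It shows how micro-averaged precision, recall, and F1 are commonly computed for end-to-end entity linking, by exact matching of (document, span, entity) predictions against the gold annotations:

```python
def micro_prf1(gold, predicted):
    # Micro-averaged precision/recall/F1 over exact-match
    # (document id, start offset, end offset, entity id) tuples.
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall > 0 else 0.0)
    return precision, recall, f1

# Hypothetical toy data: spans are character offsets, entities are Wikidata IDs.
gold = {("doc1", 0, 6, "Q64"), ("doc1", 25, 32, "Q183")}
pred = {("doc1", 0, 6, "Q64"), ("doc1", 40, 47, "Q30")}
print(micro_prf1(gold, pred))  # (0.5, 0.5, 0.5)
```

A single F1 number like this hides which mentions were missed and why, which is exactly the gap that a detailed error analysis, as advocated in the abstract, is meant to close.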