TempEL: Linking Dynamically Evolving and Newly Emerging Entities

Published: 17 Sept 2022, Last Modified: 22 Oct 2023
NeurIPS 2022 Datasets and Benchmarks
Readers: Everyone
Keywords: Entity Linking, Entity Disambiguation, Information Extraction, Temporal Data Evolution
Abstract: In our continuously evolving world, entities change over time, and new entities that previously did not exist or were unknown appear. We study how this evolution affects performance on the well-established entity linking (EL) task. To that end, we introduce TempEL, an entity linking dataset consisting of time-stratified English Wikipedia snapshots from 2013 to 2022, from which we collect both the anchor mentions of entities and the descriptions of the target entities. By capturing these temporal aspects, TempEL contrasts with existing entity linking datasets, which are composed of fixed mentions linked to a single static version of a target Knowledge Base (e.g., Wikipedia 2010 for CoNLL-AIDA). For each temporal snapshot, TempEL contains links to continual entities, i.e., entities that occur in all of the years, as well as to completely new entities that appear for the first time at some point. TempEL thus makes it possible to quantify the performance of current state-of-the-art EL models on: (i) entities whose Knowledge Base descriptions and mention contexts change over time, and (ii) newly created entities that did not previously exist (e.g., at the time the EL model was trained). Our experimental results show that, in terms of temporal performance degradation, (i) continual entities suffer a decrease of up to 3.1% in EL accuracy, while (ii) for new entities the accuracy drop is up to 17.9%. This highlights the challenge posed by the TempEL dataset and opens new research prospects in the area of time-evolving entity disambiguation.
Author Statement: Yes
Supplementary Material: pdf
Dataset Url: https://cloud.ilabt.imec.be/index.php/s/RinXy8NgqdW58RW
License: Attribution-ShareAlike 4.0 International license (CC BY-SA 4.0)
Contribution Process Agreement: Yes
In Person Attendance: Yes
Community Implementations: [3 code implementations](https://www.catalyzex.com/paper/arxiv:2302.02500/code)
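
The abstract distinguishes continual entities (present in every yearly snapshot) from new entities (appearing for the first time in some later snapshot). The minimal sketch below illustrates that distinction on toy data. It is not the authors' construction pipeline: the function name `split_entities`, the input layout (a mapping from year to a set of entity identifiers), and the placeholder identifiers are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple


def split_entities(
    snapshots: Dict[int, Set[str]],
) -> Tuple[Set[str], Dict[int, Set[str]]]:
    """Split entities into continual and new ones (hypothetical helper).

    snapshots maps a snapshot year to the set of entity ids present in
    that snapshot. This layout is assumed, not taken from TempEL.
    """
    years = sorted(snapshots)

    # Continual entities: present in every snapshot.
    continual = set.intersection(*(snapshots[y] for y in years))

    # New entities: absent from all earlier snapshots, first seen in year y.
    new_by_year: Dict[int, Set[str]] = {}
    seen = set(snapshots[years[0]])
    for y in years[1:]:
        new_by_year[y] = snapshots[y] - seen
        seen |= snapshots[y]

    return continual, new_by_year


# Toy snapshots with placeholder entity ids (not real TempEL data).
toy = {
    2013: {"A", "B"},
    2014: {"A", "B", "C"},
    2015: {"A", "C", "D"},
}
continual, new_by_year = split_entities(toy)
print(continual)
print(new_by_year)
```

In this toy example only `A` is continual, while `C` and `D` are new in 2014 and 2015 respectively. An entity such as `B`, which disappears in a later snapshot, falls into neither group.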