RAG-ED: Retrieval-Augmented Generation for Entity Disambiguation

ACL ARR 2025 February Submission 1525 Authors

13 Feb 2025 (modified: 09 May 2025) | ACL ARR 2025 February Submission | CC BY 4.0

Abstract: Entity Disambiguation (ED) resolves ambiguous mentions in text by linking them to entities in a knowledge base. A key challenge in ED is entity overshadowing, where dominant entities obscure the correct choice. We propose RAG-ED (Retrieval-Augmented Generation for Entity Disambiguation), a data-efficient three-stage pipeline consisting of a lightweight retriever, a reranker, and a strong selector based on a large language model. RAG-ED achieves state-of-the-art performance on entity overshadowing cases, outperforming prior methods by 17 points. The pipeline also maintains competitive performance across standard ED benchmarks, demonstrating its broad applicability. A key advantage of RAG-ED is its ability to identify instances where disambiguation should not be performed, which is particularly useful in settings that rely on lightweight retrievers. We conduct extensive analyses and ablation studies on diverse ED datasets, further highlighting the effectiveness of our approach.
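
The three-stage structure described in the abstract (retriever, reranker, selector with the option to abstain) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the `Entity` type, the token-overlap retriever, the Jaccard reranker, the `min_score` threshold, and the entity IDs are all hypothetical. In particular, the selector here is a simple threshold rule standing in for the large language model that the paper uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    entity_id: str      # hypothetical ID, not a real knowledge-base key
    name: str
    description: str


def tokens(text: str) -> set[str]:
    """Lowercased whitespace tokens; a toy stand-in for real text features."""
    return set(text.lower().split())


def retrieve(mention: str, kb: list[Entity], k: int = 5) -> list[Entity]:
    """Stage 1 (toy retriever): keep entities whose name shares a token with the mention."""
    m = tokens(mention)
    hits = [(len(m & tokens(e.name)), e) for e in kb]
    hits = [(score, e) for score, e in hits if score > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [e for _, e in hits[:k]]


def rerank(context: str, candidates: list[Entity]) -> list[tuple[float, Entity]]:
    """Stage 2 (toy reranker): order candidates by Jaccard overlap of context and description."""
    c = tokens(context)
    scored = []
    for e in candidates:
        d = tokens(e.description)
        union = c | d
        scored.append((len(c & d) / len(union) if union else 0.0, e))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


def select(ranked: list[tuple[float, Entity]], min_score: float = 0.05) -> Optional[Entity]:
    """Stage 3 (stand-in for the LLM selector): pick the top candidate or abstain with None."""
    if not ranked or ranked[0][0] < min_score:
        return None  # abstain: no candidate is supported well enough by the context
    return ranked[0][1]


def disambiguate(mention: str, context: str, kb: list[Entity]) -> Optional[Entity]:
    """Run the three stages in sequence."""
    return select(rerank(context, retrieve(mention, kb)))


# Toy knowledge base with two entities that share a name.
KB = [
    Entity("E1", "Michael Jordan", "American basketball player for the Chicago Bulls"),
    Entity("E2", "Michael Jordan", "American professor of machine learning and statistics"),
]

result = disambiguate("Michael Jordan", "Michael Jordan published a paper on machine learning", KB)
print(result.entity_id if result else "abstain")
```

Returning `None` from the selector mirrors the abstention behaviour the abstract highlights: when the retriever supplies no candidates, or none is supported by the context, the sketch declines to link rather than forcing a choice.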

Paper Type: Long

Research Area: Information Extraction

Research Area Keywords: entity disambiguation, entity overshadowing, retrieval augmented generation, large language models

Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Approaches to low-compute settings (efficiency), Publicly available software and/or pre-trained models, Data analysis

Languages Studied: English

Submission Number: 1525
