Entity Disambiguation on a Tight Labeling Budget

Published: 07 Oct 2023, Last Modified: 01 Dec 2023
Venue: EMNLP 2023 Findings
Submission Type: Regular Short Paper
Submission Track: Machine Learning for NLP
Keywords: entity linking, learning under a budget, tensor bilinear model
TL;DR: An effective strategy for learning an entity disambiguation model with a small annotation budget, using feature diversity and low-rank correction.
Abstract: Many real-world NLP applications face the challenge of training an entity disambiguation model for a specific domain with a small labeling budget. In this setting there is often access to a large unlabeled pool of documents. It is then natural to ask: which samples should be selected for annotation? In this paper we propose a solution that combines feature diversity with low-rank correction. Our sampling strategy is formulated in the context of bilinear tensor models. Our experiments show that the proposed approach can significantly reduce the amount of labeled data necessary to achieve a given performance.
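The abstract describes selecting samples for annotation from a large unlabeled pool by combining feature diversity with a low-rank correction. The following is a minimal, hypothetical sketch of one such diversity-driven selector (greedy farthest-point selection over an optional low-rank projection of the features); the function name, parameters, and the use of an SVD projection are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def diversity_sample(X, budget, rank=None, seed=0):
    """Hypothetical sketch: pick `budget` diverse rows of feature
    matrix X via greedy farthest-point selection. `rank`, if given,
    first projects features onto a low-rank subspace (a stand-in
    for a low-rank correction; not the paper's actual formulation).
    """
    rng = np.random.default_rng(seed)
    if rank is not None:
        # Project onto the top-`rank` right singular directions.
        _, _, Vt = np.linalg.svd(X, full_matrices=False)
        X = X @ Vt[:rank].T
    n = X.shape[0]
    selected = [int(rng.integers(n))]
    # Distance from every point to its nearest selected point.
    dists = np.linalg.norm(X - X[selected[0]], axis=1)
    while len(selected) < budget:
        nxt = int(np.argmax(dists))  # farthest point so far
        selected.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(X - X[nxt], axis=1))
    return selected
```

The greedy farthest-point rule is a standard coverage heuristic for budgeted annotation: each pick maximizes the distance to the already-selected set, so the labeled pool spreads over the feature space rather than clustering in dense regions.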
Submission Number: 1812