LinkNER: Linking Local Named Entity Recognition Models to Large Language Models using Uncertainty

Published: 23 Jan 2024, Last Modified: 23 May 2024, TheWebConf24 Oral
Keywords: Information extraction, uncertainty estimation, robustness, large language models
TL;DR: Fine-tuned NER models have limited knowledge of unseen entities, while LLMs lack specialization in NER tasks. We propose LinkNER, which combines small fine-tuned models with LLMs to achieve superior performance and enhanced robustness on NER tasks.
Abstract: Named Entity Recognition (NER) is a fundamental task in natural language understanding, with direct implications for web content analysis, search engines, and information retrieval systems. Fine-tuned NER models exhibit satisfactory performance on standard NER benchmarks. However, due to limited fine-tuning data and lack of knowledge, they perform poorly on unseen entities. As a result, the usability and reliability of NER models in web-related applications are compromised. In contrast, Large Language Models (LLMs) like GPT-4 possess extensive external knowledge, but research indicates that they lack specialization for NER tasks. Furthermore, their non-public, large-scale weights make tuning LLMs difficult. To address these challenges, we propose LinkNER, a framework that combines small fine-tuned models with LLMs, together with an uncertainty-based linking strategy called RDC that enables fine-tuned models to complement black-box LLMs, achieving better performance. We conduct experiments on standard NER test sets as well as noisy social media datasets. LinkNER improves performance on NER tasks and, in particular, outperforms SOTA models in challenging robustness tests (with a 3.04\% $\sim$ 21.30\% improvement in the F1 score). Additionally, we conduct a quantitative study to examine the impact of key components, such as uncertainty estimation methods, LLMs, and in-context learning, on various NER tasks and provide targeted web-related recommendations.
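The uncertainty-based linking idea in the abstract can be illustrated with a minimal sketch. This is not the paper's RDC implementation: the entropy-based uncertainty score, the fixed threshold, and the names `entropy`, `link_predictions`, and `llm_classify` are all assumptions made for illustration, and the LLM call is a stub supplied by the caller.

```python
import math
from typing import Callable, Dict, Tuple

Span = Tuple[int, int]  # (start, end) token offsets; hypothetical representation


def entropy(probs: Dict[str, float]) -> float:
    """Shannon entropy (in nats) of a label distribution."""
    return -sum(p * math.log(p) for p in probs.values() if p > 0.0)


def link_predictions(
    local_preds: Dict[Span, Dict[str, float]],
    llm_classify: Callable[[Span], str],
    threshold: float,
) -> Dict[Span, str]:
    """Keep confident local labels; defer uncertain spans to the LLM.

    Assumption: uncertainty is the entropy of the local model's label
    distribution, compared against a fixed threshold.
    """
    final: Dict[Span, str] = {}
    for span, probs in local_preds.items():
        if entropy(probs) > threshold:
            # The local model is unsure: ask the (black-box) LLM instead.
            final[span] = llm_classify(span)
        else:
            # The local model is confident: keep its most probable label.
            final[span] = max(probs, key=lambda label: probs[label])
    return final
```

With a stub such as `lambda span: "ORG"` for `llm_classify`, a span whose local distribution is sharply peaked keeps the local label, while a span with a nearly flat distribution is routed to the stub. In practice the threshold would be tuned on held-out data.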
Track: Web Mining and Content Analysis
Submission Guidelines Scope: Yes
Submission Guidelines Blind: Yes
Submission Guidelines Format: Yes
Submission Guidelines Limit: Yes
Submission Guidelines Authorship: Yes
Student Author: Yes
Submission Number: 563