NoTNER: Self-optimizing text reconstruction for open named entity recognition on social media

Published: 01 Jan 2025 · Last Modified: 08 Oct 2025 · J. Supercomput. 2025 · CC BY-SA 4.0
Abstract: Large language models (LLMs) bring strong reasoning and understanding capabilities to various NLP tasks. However, recent studies show that existing LLMs still struggle with information extraction, especially in noisy, low-resource settings. To address these challenges, we propose NoTNER, a novel framework for open-domain named entity recognition (NER) on social media texts that requires no fine-tuning. NoTNER integrates two key components: (1) a self-optimizing text reconstruction module based on Monte Carlo tree search (MCTS) that cleans informal inputs through prompt optimization, and (2) a zero-shot chain-of-thought reasoning template that guides entity extraction step by step. Extensive experiments on two benchmark datasets demonstrate that NoTNER achieves superior zero-shot performance compared to both fine-tuned and prompting-based baselines. Specifically, it improves F1 scores by 5–25 points over strong fine-tuned and zero-shot baselines on Tweebank-NER v1.0, and obtains competitive results on WNUT17. The framework also generalizes across multiple LLMs, including ChatGPT, LLaMA2, and Yi-34B, highlighting its robustness and deployment efficiency in noisy real-world environments.
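The abstract's first component, MCTS-driven prompt optimization for text reconstruction, can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: `llm_reconstruct` and `reward` are hypothetical stubs standing in for a real LLM call and a real reconstruction-quality score, and the candidate "edit instructions" are invented for the example. The MCTS skeleton (selection via UCT, expansion, simulation, backpropagation) is the standard algorithm the abstract names.

```python
import math
import random

def llm_reconstruct(prompt, text):
    # Hypothetical stub for an LLM reconstruction call (the paper uses real
    # LLMs such as ChatGPT, LLaMA2, and Yi-34B). Here, prompts that mention
    # slang normalization "clean" the text; others leave it unchanged.
    if "normalize" in prompt:
        return text.replace("u", "you").replace("2", "to")
    return text

def reward(cleaned):
    # Toy reward: the fewer slang tokens remain, the cleaner the text.
    slang = {"u", "2", "gr8"}
    tokens = cleaned.split()
    return 1.0 - sum(t in slang for t in tokens) / max(len(tokens), 1)

class Node:
    def __init__(self, prompt, parent=None):
        self.prompt, self.parent = prompt, parent
        self.children, self.visits, self.value = [], 0, 0.0

    def uct(self, c=1.4):
        # Upper Confidence bound for Trees: exploit mean value, explore rarely
        # visited children; unvisited children are tried first.
        if self.visits == 0:
            return float("inf")
        return (self.value / self.visits
                + c * math.sqrt(math.log(self.parent.visits) / self.visits))

def mcts_prompt_search(noisy_text, edit_instructions, iterations=20, seed=0):
    rng = random.Random(seed)
    root = Node("Rewrite the tweet.")
    for _ in range(iterations):
        node = root
        # Selection: descend the tree by UCT until reaching a leaf.
        while node.children:
            node = max(node.children, key=Node.uct)
        # Expansion: append one candidate edit instruction to the prompt.
        if node.visits > 0 and edit_instructions:
            child = Node(node.prompt + " " + rng.choice(edit_instructions),
                         parent=node)
            node.children.append(child)
            node = child
        # Simulation: reconstruct with the (stub) LLM and score the result.
        score = reward(llm_reconstruct(node.prompt, noisy_text))
        # Backpropagation: propagate the score up to the root.
        while node:
            node.visits += 1
            node.value += score
            node = node.parent
    if not root.children:
        return root.prompt
    best = max(root.children, key=lambda n: n.value / n.visits)
    return best.prompt
```

The search returns the prompt whose reconstructions scored best on average; in NoTNER, the cleaned text this prompt produces would then be fed to the second component, the zero-shot chain-of-thought extraction template.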