LANTNER - Large Language Models for Novel Tasks Annotation in Named Entity Recognition

Published: 2025 · Last Modified: 13 Jan 2026 · COINS 2025 · CC BY-SA 4.0
Abstract: Adapting Named Entity Recognition (NER) to specialised domains is a critical yet challenging task, often hampered by the prohibitive cost of manual annotation. To address this, we propose LANTNER, a framework that distils the knowledge of powerful LLMs into efficient, domain-specific NER models. Our approach combines automated data generation with lightweight model fine-tuning. The LANTNER framework comprises three stages: (1) unsupervised preprocessing of raw web text, (2) high-quality automated annotation via an LLM guided by the DSPy framework, and (3) efficient fine-tuning of a compact GLiNER model on the generated dataset. Evaluations on web-based datasets show that LANTNER achieves a peak F1-score of 0.69 and a precision of 0.70, significantly outperforming both zero-shot GLiNER and direct inference with a state-of-the-art LLM. Furthermore, our analysis demonstrates the high data efficiency of the framework: performance gains saturate around 3,000 training instances, suggesting that near-optimal results are attainable with a modest annotation budget. By strategically shifting the computational cost of LLMs to the offline training phase, LANTNER offers a practical and scalable pathway for developing high-precision, deployable NER models for specialised domains.
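The three-stage pipeline described above (preprocess, LLM-annotate, then fine-tune a student model) can be sketched as follows. This is an illustrative outline only, not the authors' code: the function names are invented, and the annotator here is a toy gazetteer stub standing in for the DSPy-guided LLM, so the sketch runs offline; the resulting `(text, spans)` pairs are what Stage 3 would feed to GLiNER fine-tuning.

```python
import re

def preprocess(raw_texts):
    """Stage 1 (sketch): unsupervised cleanup of raw web text,
    here just whitespace normalisation and exact deduplication."""
    seen, cleaned = set(), []
    for t in raw_texts:
        t = re.sub(r"\s+", " ", t).strip()
        if t and t not in seen:
            seen.add(t)
            cleaned.append(t)
    return cleaned

def llm_annotate(text, labels):
    """Stage 2 (stub): in the paper this step prompts an LLM via DSPy;
    a hard-coded gazetteer stands in so the example needs no API calls."""
    gazetteer = {"Paris": "LOCATION", "Alice": "PERSON"}  # illustrative only
    spans = []
    for surface, label in gazetteer.items():
        if label in labels:
            start = text.find(surface)
            if start != -1:
                spans.append({"start": start, "end": start + len(surface),
                              "label": label})
    return spans

def build_dataset(raw_texts, labels):
    """Stages 1+2 chained: produce (text, entity spans) training pairs
    for Stage 3, the fine-tuning of a compact student NER model."""
    return [(t, llm_annotate(t, labels)) for t in preprocess(raw_texts)]

dataset = build_dataset(
    ["Alice  moved to Paris.", "Alice  moved to Paris.", "   "],
    labels={"PERSON", "LOCATION"},
)
# duplicates and empty lines are dropped; one annotated sentence remains
```

The design point the abstract makes is that the expensive component (the LLM) appears only in this offline dataset-building loop; the deployed model is the small fine-tuned student.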