Boosting LLMs with Ontology-Aware Prompt for NER Data Augmentation

Published: 01 Jan 2024, Last Modified: 16 May 2025, ICASSP 2024, CC BY-SA 4.0
Abstract: Named Entity Recognition (NER) data augmentation (DA) aims to improve the performance and generalization capabilities of NER models by generating training data at scale. The key challenge lies in ensuring that the generated samples maintain contextual diversity while preserving label consistency. However, the dominant existing methods fail to satisfy both criteria simultaneously. Inspired by the extensive generative capabilities of large language models (LLMs), we propose ANGEL, a frAmework integrating the oNtoloGy structure and instructivE prompting within LLMs. Specifically, the hierarchical ontology structure guides prompt ranking, while instructive prompting enhances LLMs’ mastery of domain knowledge, empowering synthetic sample generation and annotation. Experiments show that ANGEL surpasses state-of-the-art (SOTA) baselines, yielding absolute F1 increases of 2.86% and 0.93% on two benchmark datasets, respectively.