Abstract: Sequence labeling remains a significant challenge in low-resource, domain-specific scenarios, particularly for character-dense languages. Existing methods primarily focus on enhancing model comprehension and improving data diversity to boost performance. However, these approaches still struggle with inadequate model applicability and semantic distribution biases in domain-specific contexts. To overcome these limitations, we propose a novel framework that combines an LLM-based knowledge enhancement workflow with a span-based Knowledge Fusion for Rich and Efficient Extraction (KnowFREE) model. Our workflow employs explanation prompts to generate precise contextual interpretations of target entities, effectively mitigating semantic biases and enriching the model's contextual understanding. The KnowFREE model further integrates extension label features, enabling efficient nested entity extraction without relying on external knowledge during inference. Experiments on multiple domain-specific sequence labeling datasets demonstrate that our approach achieves state-of-the-art performance, effectively addressing the challenges posed by low-resource settings.
Paper Type: Long
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: zero/few-shot extraction, data augmentation, named entity recognition
Contribution Types: NLP engineering experiment, Approaches to low-resource settings
Languages Studied: Chinese, English
Keywords: zero/few-shot extraction, data augmentation, named entity recognition
Submission Number: 544
Loading