Abstract: Nested entities tend to obtain similar representations from pre-trained language models, posing challenges for Named Entity Recognition (NER), especially in the few-shot setting, where prototype shifts often occur due to distribution differences between the support and query sets. In this paper, we regard an entity representation as the combination of a prototype representation and a non-prototype representation. Hypothesizing that explicitly leveraging the prototype representation can help mitigate potential prototype shifts, we propose a Prototype-Attention mechanism within a Contrastive Learning framework (PACL) for few-shot nested NER. PACL first applies a prototype attention mechanism to generate prototype-enhanced span representations that mitigate the prototype shift. It then adopts a novel prototype-span contrastive loss that compares prototype-enhanced span representations with both prototypes and original semantic representations, further reducing prototype shifts and overcoming the limitation that the O type has no unique prototype. Our experiments show that PACL outperforms baseline models on 1-shot and 5-shot tasks in terms of $F_1$ score. Experiments on English datasets demonstrate the effectiveness of PACL, while experiments on cross-lingual datasets demonstrate its robustness. Further analyses indicate that our Prototype-Attention mechanism is simple yet effective and generalizes well.
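The two components named in the abstract can be made concrete with a short sketch. The PyTorch code below is a minimal illustration, not the paper's implementation: the tensor shapes, the residual combination in `prototype_attention`, the label convention (`-1` marks O-type spans), and the temperature `tau` are all assumptions introduced for the example.

```python
import torch
import torch.nn.functional as F


def prototype_attention(span_reprs: torch.Tensor, prototypes: torch.Tensor) -> torch.Tensor:
    """Attend from span representations to class prototypes and mix the
    result back in (the residual combination is an assumption of this sketch)."""
    # span_reprs: (N, d) span embeddings; prototypes: (C, d) class prototypes
    scores = span_reprs @ prototypes.T / span_reprs.size(-1) ** 0.5  # (N, C)
    weights = F.softmax(scores, dim=-1)
    proto_context = weights @ prototypes                             # (N, d)
    return span_reprs + proto_context                                # prototype-enhanced spans


def prototype_span_contrastive_loss(enhanced, prototypes, original, labels, tau=0.1):
    """InfoNCE-style loss: entity spans are pulled toward their class
    prototype; O-type spans (label == -1, an assumed convention) use their
    own original semantic representation as the positive instead, since
    the O type has no unique prototype."""
    enhanced = F.normalize(enhanced, dim=-1)
    protos = F.normalize(prototypes, dim=-1)
    original = F.normalize(original, dim=-1)

    is_entity = labels >= 0
    positives = torch.where(
        is_entity.unsqueeze(-1),
        protos[labels.clamp(min=0)],  # own-class prototype for entity spans
        original,                     # original representation for O spans
    )
    logits_pos = (enhanced * positives).sum(-1, keepdim=True) / tau  # (N, 1)
    logits_neg = enhanced @ protos.T / tau                           # (N, C)
    # drop each entity span's own prototype from the negatives
    own = torch.arange(protos.size(0), device=labels.device) == labels.clamp(min=0).unsqueeze(-1)
    logits_neg = logits_neg.masked_fill(own & is_entity.unsqueeze(-1), float("-inf"))

    logits = torch.cat([logits_pos, logits_neg], dim=-1)             # positive at index 0
    targets = torch.zeros(labels.size(0), dtype=torch.long, device=labels.device)
    return F.cross_entropy(logits, targets)


# Toy usage: 8 candidate spans, 4 entity classes, 16-dim representations.
spans = torch.randn(8, 16)
protos = torch.randn(4, 16)
labels = torch.tensor([0, 1, -1, 2, -1, 3, 0, 1])
enhanced = prototype_attention(spans, protos)
loss = prototype_span_contrastive_loss(enhanced, protos, spans, labels)
```

Switching the positive to the span's own original representation for O-type spans is one plausible way to read "comparing prototype-enhanced span representations with prototypes and original semantic representations"; the paper's exact pairing scheme may differ.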
Paper Type: Long
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: NLP in resource-constrained settings
Contribution Types: Approaches to low-resource settings
Languages Studied: English, Chinese, German, Russian
Section 2 Permission To Publish Peer Reviewers Content Agreement: Authors grant permission for ACL to publish peer reviewers' content
Submission Number: 29