Multi-Task Multi-Attention Transformer for Generative Named Entity Recognition

Published: 01 Jan 2024, Last Modified: 19 May 2025 · IEEE ACM Trans. Audio Speech Lang. Process. 2024 · CC BY-SA 4.0
Abstract: Most previous sequence labeling models are task-specific, whereas recent years have witnessed the rise of generative models, owing to their advantage of unifying all named entity recognition (NER) tasks into a single encoder-decoder framework. Although these models achieve promising performance, our pilot studies demonstrate that existing generative models are ineffective at detecting entity boundaries and estimating entity types. In this paper, we propose a multi-task Transformer that incorporates an entity boundary detection task into the named entity recognition task. More concretely, we achieve entity boundary detection by classifying the relations between tokens within the sentence. To improve the accuracy of entity-type mapping during decoding, we adopt an external knowledge base to calculate prior entity-type distributions and then incorporate this information into the model via the self- and cross-attention mechanisms. We perform experiments on extensive NER benchmarks, including flat, nested, and discontinuous NER datasets involving long entities. Experimental results show that our approach improves the performance of the generative NER model considerably, raising $F_1$ scores by nearly $+0.3 \sim +1.5$ across a broad range of benchmarks or performing on par with the best generative NER model.
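
To make the multi-task idea concrete, the sketch below shows one way the abstract's two training signals could be combined: a generative (encoder-decoder) NER objective plus an auxiliary head that classifies relations between token pairs as a proxy for entity boundary detection. This is a minimal PyTorch sketch under our own assumptions, not the authors' released code; names such as `BoundaryRelationHead`, `num_relations`, and `alpha` are illustrative.

```python
# Minimal sketch: generative NER with an auxiliary token-pair boundary task.
# All hyperparameters and class names are illustrative assumptions.
import torch
import torch.nn as nn


class BoundaryRelationHead(nn.Module):
    """Scores a relation label for every (token_i, token_j) pair via a biaffine map."""

    def __init__(self, hidden_size: int, num_relations: int):
        super().__init__()
        self.head_proj = nn.Linear(hidden_size, hidden_size)
        self.tail_proj = nn.Linear(hidden_size, hidden_size)
        self.bilinear = nn.Bilinear(hidden_size, hidden_size, num_relations)

    def forward(self, enc: torch.Tensor) -> torch.Tensor:
        # enc: (batch, seq_len, hidden) -> logits: (batch, seq_len, seq_len, num_relations)
        b, n, h = enc.shape
        heads = torch.tanh(self.head_proj(enc)).unsqueeze(2).expand(b, n, n, h)
        tails = torch.tanh(self.tail_proj(enc)).unsqueeze(1).expand(b, n, n, h)
        return self.bilinear(heads.reshape(-1, h), tails.reshape(-1, h)).view(b, n, n, -1)


class MultiTaskGenerativeNER(nn.Module):
    """Encoder-decoder NER generator plus a token-pair boundary classifier on the encoder states."""

    def __init__(self, vocab_size: int, hidden_size: int = 256, num_relations: int = 3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(hidden_size, nhead=4, batch_first=True), num_layers=2
        )
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(hidden_size, nhead=4, batch_first=True), num_layers=2
        )
        self.lm_head = nn.Linear(hidden_size, vocab_size)
        self.boundary_head = BoundaryRelationHead(hidden_size, num_relations)

    def forward(self, src_ids: torch.Tensor, tgt_ids: torch.Tensor):
        enc = self.encoder(self.embed(src_ids))
        dec = self.decoder(self.embed(tgt_ids), enc)
        gen_logits = self.lm_head(dec)              # generative NER task
        boundary_logits = self.boundary_head(enc)   # auxiliary boundary-detection task
        return gen_logits, boundary_logits


def multitask_loss(gen_logits, tgt_ids, boundary_logits, boundary_labels, alpha: float = 0.5):
    """Joint objective: generation loss plus a weighted boundary-relation loss."""
    gen_loss = nn.functional.cross_entropy(gen_logits.transpose(1, 2), tgt_ids)
    boundary_loss = nn.functional.cross_entropy(boundary_logits.permute(0, 3, 1, 2), boundary_labels)
    return gen_loss + alpha * boundary_loss
```

In a training step, both losses are computed from the same encoder states and backpropagated jointly. The knowledge-base component described in the abstract is not sketched here; it would additionally bias the self- and cross-attention scores with prior entity-type distributions computed from the external knowledge base.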