Abstract: Named entity recognition (NER) is a fundamental task in natural language processing, yet its robustness has barely been studied. This paper finds that conventional text attacks designed for sentence classification can cause label mutation in NER, owing to the naturally finer granularity of named entity ground truth. We therefore define a new style of text attack, the \textit{virtual attack}, where \textit{virtual} indicates that the attack relies on the model's prediction rather than the ground truth. On top of that, we propose a novel fast NER attacker that inserts a ``virtual boundary'' into the text. It turns out that current strong language models (e.g., RoBERTa, DeBERTa) show a strong tendency to wrongly recognize these virtual boundaries as entities. Our attack is effective on both English and Chinese, achieving a 70\%--90\% attack success rate, and is 50 times faster than previous methods.
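The mechanism described in the abstract can be sketched at a toy level: insert a candidate boundary token into the input and judge success against the model's own prediction rather than gold labels. This is a minimal, hypothetical illustration; `predict_entities` is an invented stand-in for a real NER model, and the actual attacker proposed in the paper is not reproduced here.

```python
def predict_entities(tokens):
    """Toy stand-in for an NER model: predicts every capitalized,
    non-sentence-initial token as an entity mention."""
    return {tok for i, tok in enumerate(tokens) if i > 0 and tok[:1].isupper()}

def virtual_boundary_attack(tokens, boundary="Villa"):
    """Try inserting `boundary` at each position and return the first
    insertion that makes the model predict the inserted (non-entity)
    token as an entity -- a 'virtual' attack, since success is measured
    against the model's prediction, not ground-truth labels."""
    for i in range(1, len(tokens) + 1):
        perturbed = tokens[:i] + [boundary] + tokens[i:]
        if boundary in predict_entities(perturbed):
            return i, perturbed
    return None

pos, adv = virtual_boundary_attack("the cat sat on the mat".split())
```

Because each candidate only requires one forward pass per insertion point (rather than a search over word substitutions), this style of attack can be much cheaper than classification-style substitution attacks, which is consistent with the speedup claimed in the abstract.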
Paper Type: long
Research Area: Interpretability and Analysis of Models for NLP