Abstract: We explore boundary attention models for character-level Chinese NER. We test the standard Transformer model as well as a novel variant in which the encoder block combines global character-level attention with information from nearby characters using convolutions. The convolutions are modulated by a gate that acts as a boundary signal, and this gated boundary information is injected into the encoder's forward pass so that the model produces entity boundaries from the input sequence. We perform extensive experiments on four Chinese NER datasets. Our Transformer variant consistently outperforms the standard Transformer at the character level and converges faster while learning more robust character-level alignments.
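The sketch below illustrates one plausible reading of the encoder block described in the abstract: global self-attention over characters combined with a gated local convolution whose gate acts as a soft boundary signal. All names and hyperparameters (e.g. BoundaryGatedEncoderLayer, conv_kernel) are hypothetical and not taken from the paper.

```python
# Illustrative sketch only; not the authors' released implementation.
import torch
import torch.nn as nn

class BoundaryGatedEncoderLayer(nn.Module):
    def __init__(self, d_model=256, n_heads=8, conv_kernel=3, dropout=0.1):
        super().__init__()
        # Global character-level context via standard multi-head self-attention.
        self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout,
                                          batch_first=True)
        # Local context from nearby characters via a depthwise 1-D convolution.
        self.conv = nn.Conv1d(d_model, d_model, conv_kernel,
                              padding=conv_kernel // 2, groups=d_model)
        # Gate deciding, per character, how much convolutional (boundary)
        # information to mix into the attention output.
        self.gate = nn.Linear(2 * d_model, d_model)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, key_padding_mask=None):
        # x: (batch, seq_len, d_model) character embeddings.
        attn_out, _ = self.attn(x, x, x, key_padding_mask=key_padding_mask)
        conv_out = self.conv(x.transpose(1, 2)).transpose(1, 2)
        # Sigmoid gate blends global attention with local convolution; the gate
        # activations serve as a soft per-character boundary indicator.
        g = torch.sigmoid(self.gate(torch.cat([attn_out, conv_out], dim=-1)))
        mixed = g * conv_out + (1 - g) * attn_out
        x = self.norm1(x + self.dropout(mixed))
        x = self.norm2(x + self.dropout(self.ffn(x)))
        return x
```

A stack of such layers could replace the standard Transformer encoder layers, with a token-level classifier (e.g. softmax or CRF over BIO tags) on top of the final character representations.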
Paper Type: long
Research Area: Information Extraction