Abstract: The Flat-LAttice Transformer (FLAT) has achieved notable success in Chinese named entity recognition (NER) by integrating lexical information into the widely used Transformer encoder. FLAT augments each sentence with a flat lattice, a token sequence consisting of the sentence's characters and the lexicon words matched in it, and computes self-attention among all tokens. However, this self-attention has quadratic complexity in the lattice length, so long sentences with many matched words significantly increase memory and computational costs. To alleviate this issue, we propose a novel lightweight lexicon-enhanced Transformer (LLET) for Chinese NER. Specifically, we introduce two variants in which characters attend to characters and words, either jointly or separately. Experimental results on four public Chinese NER datasets show that both variants achieve significant memory savings while maintaining performance comparable to FLAT.
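To make the cost argument concrete, the following minimal sketch builds a flat lattice for a toy sentence and counts attention-score entries. It is an illustration under stated assumptions, not the authors' implementation: the helper names `build_flat_lattice` and `attention_score_entries`, the toy lexicon, and the reading of "character attention" as "only characters act as queries" are all hypothetical choices made for this example.

```python
def build_flat_lattice(sentence, lexicon):
    """Return (chars, words); each word is (text, head_index, tail_index).

    Assumption: lexicon matching is plain substring lookup for words of
    length >= 2, which is a simplification chosen for this sketch.
    """
    chars = list(sentence)
    n = len(chars)
    max_len = max((len(w) for w in lexicon), default=0)
    words = []
    for start in range(n):
        for length in range(2, max_len + 1):
            end = start + length
            if end > n:
                break
            candidate = sentence[start:end]
            if candidate in lexicon:
                words.append((candidate, start, end - 1))
    return chars, words


def attention_score_entries(num_chars, num_words):
    """Count score-matrix entries per attention head.

    full: every lattice token attends to every lattice token.
    char_queries: only characters act as queries over all tokens
    (an assumed reading of character-focused attention).
    """
    total = num_chars + num_words
    full = total * total
    char_queries = num_chars * total
    return full, char_queries


# Toy sentence and lexicon, chosen only for illustration.
sentence = "重庆人和药店"
lexicon = {"重庆", "重庆人", "人和药店", "药店"}

chars, words = build_flat_lattice(sentence, lexicon)
full, char_only = attention_score_entries(len(chars), len(words))
print(len(chars), len(words), full, char_only)
```

With 6 characters and 4 matched words, full lattice self-attention needs a 10 × 10 score matrix, whereas restricting queries to characters needs 6 × 10. The gap widens as the number of matched words grows relative to the number of characters.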