Cross-Interaction of Chinese Characters Structures and Boundary Features for Improving Clinical Named Entity Recognition

Published: 2025, Last Modified: 21 Jan 2026. Venue: IEEE J. Biomed. Health Informatics 2025. License: CC BY-SA 4.0
Abstract: In clinical named entity recognition (CNER), a natural language processing task, accurately identifying the boundaries and categories of medical entities is crucial. However, traditional methods struggle to recognize the many clinical terms and symbols they have never encountered before, which limits CNER performance. In addition, some Chinese clinical entities are easy to confuse because they are semantically similar yet belong to quite different categories, such as “肺结节” (pulmonary nodules, a symptom entity) and “肺结核” (pulmonary tuberculosis, a disease entity), and this can lead to entity misidentification. To address these problems, we propose a novel NER model called Cross-Interaction of Chinese Characters Structures and Boundary Features (CCS). The proposed model leverages Chinese character structural features and boundary information to identify confusing entities comprehensively and accurately. We further design a Cross-Attention mechanism to capture dependency relationships between different entities and the radicals of characters, enhancing the model's semantic understanding of specialized terms and symbols and improving its ability to recognize boundaries. Experimental results show that the proposed model outperforms other state-of-the-art models on public medical datasets, achieving significant improvements on the CCKS2020, CMeEE, CMI, and IMCS datasets.
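The abstract does not give the details of the Cross-Attention mechanism, so the following is only a minimal sketch of generic scaled dot-product cross-attention, in which character representations act as queries over radical representations. It is not the authors' architecture: the function names, the toy 2-dimensional vectors, and the choice of characters as queries and radicals as keys and values are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (hypothetical helper).

    Each query vector attends over all key vectors; the output for a query
    is the attention-weighted sum of the value vectors.
    """
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        # Weighted sum of the value vectors, one output dimension at a time.
        out = [sum(wj * v[i] for wj, v in zip(w, values)) for i in range(len(values[0]))]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


# Toy, made-up embeddings (assumption): 3 characters and 2 radicals.
char_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
radical_vecs = [[1.0, 0.0], [0.0, 1.0]]

# Characters query the radicals; `attn[i][j]` is how much character i
# attends to radical j, and `fused[i]` is its radical-aware representation.
fused, attn = cross_attention(char_vecs, radical_vecs, radical_vecs)
```

In a real model the queries, keys, and values would come from learned projections of contextual character and radical embeddings; the sketch omits those projections to keep the attention step itself visible.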