Automatic ICD Coding Based on Segmented ClinicalBERT with Hierarchical Tree Structure Learning

Published: 01 Jan 2023, Last Modified: 29 Jun 2023, Venue: DASFAA (4) 2023, Readers: Everyone
Abstract: Automatic ICD coding aims to assign International Classification of Diseases (ICD) codes to clinical notes written by clinicians. It is crucial for saving human resources and has attracted much research attention in recent years. However, clinical notes are long, complex narratives and ICD codes follow a long-tailed distribution, so existing studies struggle to extract key information from the notes and to handle the many small-data learning problems on the tail codes, which makes satisfactory performance hard to achieve. In this paper, we present a ClinicalBERT-based model for automatic ICD coding that copes with complex, long clinical narratives via a segmentation learning mechanism and exploits the tree-like structure of ICD codes to transmit information among code nodes. Specifically, a novel hierarchical tree structure learning module enables each code to use information from both upper and lower nodes of the tree, so that better code classifiers are learned for both head and tail codes. Experiments on the MIMIC-III dataset show that our model outperforms current state-of-the-art (SOTA) ICD coding methods.
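
The two ideas named in the abstract, segmenting a long note and walking the ICD code tree, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the 510-token default window, and the rule that an ICD-9 code's ancestors are obtained by truncating digits after the decimal point are all assumptions made for illustration.

```python
from typing import List


def split_into_segments(tokens: List[str], segment_len: int = 510,
                        stride: int = 510) -> List[List[str]]:
    """Split a long token list into windows of at most segment_len tokens.

    Hypothetical helper: a stride smaller than segment_len yields
    overlapping windows; the defaults are illustrative only.
    """
    if segment_len <= 0 or stride <= 0:
        raise ValueError("segment_len and stride must be positive")
    segments = []
    for start in range(0, len(tokens), stride):
        segments.append(tokens[start:start + segment_len])
        # Stop once the current window already reaches the end of the note.
        if start + segment_len >= len(tokens):
            break
    return segments


def icd9_ancestors(code: str) -> List[str]:
    """Return the ancestors of an ICD-9 code, nearest first.

    Assumption: ancestors are found by dropping trailing digits after the
    decimal point, e.g. '428.21' -> '428.2' -> '428'.
    """
    if "." not in code:
        return []
    head, tail = code.split(".", 1)
    chain = [f"{head}.{tail[:i]}" for i in range(len(tail) - 1, 0, -1)]
    chain.append(head)
    return chain


if __name__ == "__main__":
    note = "patient admitted with acute on chronic systolic heart failure".split()
    print(split_into_segments(note, segment_len=4, stride=4))
    print(icd9_ancestors("428.21"))
```

In a setup like this, each segment would be encoded separately by the language model and the per-segment outputs combined afterwards, while the ancestor chain gives the upper nodes from which a tail code could borrow information; how the paper actually combines them is not specified in the abstract.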