Distantly supervised end-to-end medical entity extraction from electronic health records with human-level quality

28 Sept 2020 (modified: 05 May 2023), ICLR 2021 Conference Blind Submission, Readers: Everyone
Keywords: entity extraction, medical entity extraction, named entity recognition, named entity normalization, electronic health records, unsupervised learning, distant supervision.
Abstract: Medical entity extraction (EE) is a standard procedure used as a first stage in medical text processing. Usually, medical EE is a two-step process: named entity recognition (NER) and named entity normalization (NEN). We propose a novel method of doing medical EE from electronic health records (EHR) as a single-step multi-label classification task by fine-tuning a transformer model pretrained on a large EHR dataset. Our model is trained end-to-end in a distantly supervised manner using targets automatically extracted from a medical knowledge base. We show that our model learns to generalize for entities that are present frequently enough, achieving human-level classification quality for the most frequent entities. Our work demonstrates that medical entity extraction can be done end-to-end without human supervision and with human-level quality, given the availability of a large enough amount of unlabeled EHR and a medical knowledge base.
One-sentence Summary: We propose a method of medical entity extraction as a single-step multi-label classification task with distant supervision labels and achieve human-level extraction quality for the most frequent entities.
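As a rough illustration of what "targets automatically extracted from a medical knowledge base" could look like in a multi-label setup, the sketch below builds a multi-hot target vector for one record by matching knowledge-base terms against its text. This is not the authors' procedure: the toy knowledge base `TOY_KB`, the concept IDs, the helper names `normalize` and `distant_labels`, and the plain phrase-matching rule are all assumptions made for the example.

```python
import re

# Hypothetical toy knowledge base: concept ID -> surface forms.
# IDs and terms are invented for illustration only.
TOY_KB = {
    "C_HYPERTENSION": ["hypertension", "high blood pressure"],
    "C_DIABETES": ["diabetes", "diabetes mellitus"],
    "C_ASTHMA": ["asthma"],
}


def normalize(text):
    """Lowercase the text and reduce it to space-separated alphanumeric tokens."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def distant_labels(text, kb):
    """Return a multi-hot target list, one slot per concept in sorted ID order.

    A concept is marked 1 if any of its surface forms occurs in the text
    as a whole-token phrase (an assumed, deliberately simple matching rule).
    """
    padded = f" {normalize(text)} "
    return [
        int(any(f" {normalize(term)} " in padded for term in kb[cid]))
        for cid in sorted(kb)
    ]


record = "Patient reports high blood pressure and mild asthma."
targets = distant_labels(record, TOY_KB)
print(sorted(TOY_KB))  # slot order: C_ASTHMA, C_DIABETES, C_HYPERTENSION
print(targets)         # [1, 0, 1]
```

Vectors like `targets` would then serve as training labels for a multi-label classifier over the record text, with no human annotation involved in producing them.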
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Reviewed Version (pdf): https://openreview.net/references/pdf?id=nA_UkZdg_R
