Intermediate Domain Finetuning for Weakly Supervised Domain-adaptive Clinical NER

Published: 01 Jan 2023, Last Modified: 27 Sept 2024. Venue: BioNLP@ACL 2023. License: CC BY-SA 4.0
Abstract: Accurate human-annotated data for real-world use cases can be scarce and expensive to obtain. In the clinical domain, obtaining such data is even more difficult due to privacy concerns, which not only restrict open access to quality data but also require that the annotation be done by domain experts. In this paper, we propose a novel framework, InterDAPT, that leverages Intermediate Domain Finetuning to allow language models to adapt to narrow domains with small, noisy datasets. By making use of peripherally-related, unlabeled datasets, this framework circumvents domain-specific data scarcity issues. Our results show that this weakly supervised framework provides performance improvements in downstream clinical named entity recognition tasks.