Caveats in Generating Medical Imaging Labels from Radiology Reports with Natural Language Processing

12 Apr 2019 (modified: 02 Jul 2019) | MIDL 2019 Conference Abstract Submission | Readers: Everyone
  • Keywords: medical imaging, radiology reports, machine learning, NLP
  • TL;DR: In medical imaging, image labels and report labels differ due to the existence of clinically non-actionable findings.
  • Abstract: Acquiring high-quality annotations in medical imaging is usually a costly process. Automatic label extraction with natural language processing (NLP) has emerged as a promising way to bypass the need for expert annotation. Despite its convenience, the limitations of this approximation have not been carefully examined and are not well understood. Using a challenging set of 1,000 chest X-ray studies and their corresponding radiology reports, we show that there is a surprisingly large discrepancy between what radiologists visually perceive and what they clinically report. Furthermore, with inherently flawed reports as ground truth, state-of-the-art medical NLP fails to produce high-fidelity labels.