Context-Aware Filtering of Unstructured Radiology Reports by Anatomical Region

Zakk Heile, Pranav Manjunath, Brian Lerner, Samuel Berchuck, Monica Agrawal, Timothy W Dunn

Published: 27 Nov 2025, Last Modified: 09 Dec 2025
Venue: ML4H 2025 Poster
Readers: Everyone
License: CC BY 4.0
Keywords: radiology reports, information extraction, anatomical region, clinical NLP, report filtering
TL;DR: We develop interpretable sequence models that filter unstructured or bundled radiology reports by anatomical region, outperforming neural networks and pre-trained language models while generalizing across institutions.
Track: Proceedings
Abstract: Radiology reports contain essential clinical information but often remain in unstructured, free-text formats. Notably, multiple imaging examinations performed simultaneously (such as CT head, facial bones, and cervical spine in trauma cases) may be bundled into a single report that consolidates findings from all studies into one free-text document, written jointly. Because individual sentences may reference ambiguous or overlapping anatomy (e.g., “there is a fracture”), sentence-level anatomic classification—filtering a report to retain only findings relevant to a specific anatomical region—is essential for downstream tasks such as structured label extraction and for creating clean, bijective training data for radiology report generation models. While formatting differs across reports, the clinical language remains precise. Using that fact, we develop context-aware classical models with feature engineering that surpass trained neural networks and pre-trained language models. We show that the learned model weights generalize effectively to MIMIC-IV radiology reports and that our approach achieves near-optimal performance with only a small amount of labeled training data. Together, these results make our approach practical and reproducible for new settings.
General Area: Applications and Practice
Specific Subject Areas: Explainability & Interpretability, Natural Language Processing
Data And Code Availability: No
Ethics Board Approval: Yes
Entered Conflicts: I confirm the above
Anonymity: I confirm the above
Submission Number: 112