Abstract: Breast ultrasound plays a pivotal role in detecting and diagnosing breast abnormalities. Radiology reports summarize key findings from these examinations, highlighting lesion characteristics and malignancy assessments. However, extracting this information is challenging because the reports are unstructured and often vary in linguistic style and formatting. While proprietary LLMs such as GPT-4 retrieve this information effectively, they are costly and raise privacy concerns when handling protected health information. This study presents a pipeline for developing an in-house LLM to extract clinical information from these reports. We first use GPT-4 to create a small labeled dataset, then fine-tune a Llama3-8B model on it. Evaluated on a subset of reports annotated by clinicians, the proposed model achieves an average F1 score of 84.6%, which is on par with GPT-4. Our findings demonstrate that it is feasible to develop an in-house LLM that not only matches the performance of GPT-4 but also reduces cost and enhances data privacy.
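To make the evaluation metric concrete, the following is a minimal sketch of how an average F1 score over extracted fields could be computed against clinician annotations. The abstract does not specify the field names, the matching criterion, or the averaging scheme, so the fields `birads` and `lesion_size`, the exact-string match, and the unweighted mean across fields are all assumptions made for illustration.

```python
def field_f1(gold, pred, fields):
    """Per-field F1, counting a prediction as correct only on exact match.

    gold, pred: lists of dicts (one per report) mapping field -> value or None.
    Hypothetical scheme: the study's actual matching rule is not stated.
    """
    scores = {}
    for f in fields:
        tp = fp = fn = 0
        for g, p in zip(gold, pred):
            g_val, p_val = g.get(f), p.get(f)
            if p_val is not None and p_val == g_val:
                tp += 1
            else:
                if p_val is not None:
                    fp += 1  # model extracted a wrong or spurious value
                if g_val is not None:
                    fn += 1  # annotated value was missed or mismatched
        denom = 2 * tp + fp + fn
        scores[f] = 2 * tp / denom if denom else 0.0
    return scores


def average_f1(scores):
    """Unweighted mean of per-field F1 scores (assumed averaging scheme)."""
    return sum(scores.values()) / len(scores)


# Toy data with hypothetical field names, not taken from the study.
gold = [
    {"birads": "4", "lesion_size": "12 mm"},
    {"birads": "2", "lesion_size": None},
]
pred = [
    {"birads": "4", "lesion_size": "12 mm"},
    {"birads": "3", "lesion_size": None},
]

scores = field_f1(gold, pred, ["birads", "lesion_size"])
mean_f1 = average_f1(scores)
```

In this toy case the mismatched `birads` value in the second report counts as both a false positive and a false negative, so that field scores 0.5 while `lesion_size` scores 1.0.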