LEAVS: An LLM-Based Labeler for Abdominal CT Supervision

Ricardo Bigolin Lanfredi, Yan Zhuang, Mark Finkelstein, Praveen Thoppey Srinivasan Balamuralikrishna, Luke Krembs, Brandon Khoury, Arthi Reddy, Pritam Mukherjee, Neil M. Rofsky, Ronald M. Summers

Published: 2025, Last Modified: 07 Dec 2025, MICCAI (5) 2025, CC BY-SA 4.0

Abstract: Extracting structured labels from radiology reports has been employed to create vision models that detect several types of abnormalities simultaneously. However, existing works focus mainly on the chest region. Few works have investigated abdominal radiology reports due to the more complex anatomy and a wider range of pathologies in the abdomen. We propose LEAVS (Large language model Extractor for Abdominal Vision Supervision). This labeler can annotate the certainty of presence and the urgency of seven types of abnormalities for nine abdominal organs on CT radiology reports. To ensure broad coverage, we chose abnormalities that encompass most of the finding types from CT reports. Our approach employs a specialized chain-of-thought prompting strategy for a locally run LLM using sentence extraction and multiple-choice questions in a tree-based decision system. We demonstrate that the LLM can extract several abnormality types across abdominal organs with an average F1 score of 0.89, significantly outperforming competing labelers and humans. Additionally, we show that the extraction of urgency labels achieves performance comparable to that of human annotations. Finally, we demonstrate that the abnormality labels contain valuable information for training a vision model that classifies several organs as normal or abnormal. We release our code and structured annotations for a publicly available dataset containing over 1,000 CT volumes.
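
The tree-based decision flow mentioned in the abstract (sentence extraction, then multiple-choice questions) can be illustrated with a minimal sketch. This is not the released LEAVS code: the answer options, the question wording, the function names, and the keyword-based `toy_ask` stand-in for the LLM call are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical answer options; the abstract does not list the actual choices.
CERTAINTY = ["present", "absent", "not mentioned"]
URGENCY = ["low", "high"]

Ask = Callable[[str, str, List[str]], str]


def extract_sentences(report: str, organ: str) -> List[str]:
    """Stand-in for the sentence-extraction step: keep sentences naming the organ."""
    sentences = [s.strip() for s in report.split(".") if s.strip()]
    return [s for s in sentences if organ.lower() in s.lower()]


def label_organ(report: str, organ: str, findings: List[str],
                ask: Ask) -> Dict[str, Dict[str, Optional[str]]]:
    """Walk a small decision tree: organ mentioned? -> certainty -> urgency."""
    sentences = extract_sentences(report, organ)
    if not sentences:
        # Root branch: nothing about this organ, so no further questions are asked.
        return {f: {"certainty": "not mentioned", "urgency": None} for f in findings}
    context = ". ".join(sentences)
    labels: Dict[str, Dict[str, Optional[str]]] = {}
    for finding in findings:
        question = f"Does the report describe '{finding}' in the {organ}?"
        certainty = ask(context, question, CERTAINTY)
        urgency = None
        if certainty == "present":
            # Urgency is only asked on the branch where the finding is present.
            urgency = ask(context, f"How urgent is the '{finding}'?", URGENCY)
        labels[finding] = {"certainty": certainty, "urgency": urgency}
    return labels


def toy_ask(context: str, question: str, options: List[str]) -> str:
    """Deterministic keyword stand-in for an LLM call, for demonstration only."""
    text = context.lower()
    if options == URGENCY:
        return "high" if "acute" in text else "low"
    finding = question.split("'")[1]
    if finding not in text:
        return "not mentioned"
    if "no " + finding in text:
        return "absent"
    return "present"


report = "The liver contains a 2 cm mass. No mass in the spleen. Kidneys are unremarkable."
liver = label_organ(report, "liver", ["mass"], toy_ask)
spleen = label_organ(report, "spleen", ["mass"], toy_ask)
pancreas = label_organ(report, "pancreas", ["mass"], toy_ask)
```

In a real pipeline, `toy_ask` would be replaced by a call to a locally run LLM that is constrained to return one of the listed options. The tree structure keeps each prompt short and avoids asking urgency questions for findings that are absent.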

External IDs: dblp:conf/miccai/LanfrediZFBKKRMRS25