Abstract: The development of robust deep learning systems for radiology requires large annotated datasets, which are costly and time-consuming to produce manually. Recent advances in large language models (LLMs) suggest that these models could serve as automated annotators for radiological studies. However, deploying LLMs as surrogates for human annotators raises concerns about scalability, data quality, and patient privacy. Moreover, the interpretability of annotations produced by black-box LLMs remains limited unless they are validated on downstream tasks.