From Structured Data to Clinical Notes: Robust Clinical Decision Support with Fine-Tuned LLMs

Frederike Lübeck; Jonas Bernhard Wildberger; Frederik Träuble; Maximilian Mordig; Sergios Gatidis; Andreas Krause; Bernhard Schölkopf

Published: 09 Jun 2025, Last Modified: 03 Jul 2025. Venue: FMSD @ ICML 2025. License: CC BY 4.0

Keywords: Large Language Models, Structured Data, Clinical Notes, Clinical Decision Support

TL;DR: This paper shows that large language models fine-tuned on structured clinical data can accurately predict cardiovascular disease risk and generalize well to unstructured clinical notes at inference time.

Abstract: Clinical machine learning models are typically trained on highly structured and consistent datasets but deployed in real-world settings dominated by unstructured clinical text, creating a fundamental challenge for practical adoption. In this work, we investigate whether large language models (LLMs), fine-tuned on structured patient data, can generalize effectively to unstructured clinical notes at inference time. Using the UK Biobank dataset for cardiovascular disease (CVD) risk prediction, we demonstrate that LLMs trained on structured representations achieve performance comparable to specialized tabular machine learning models. More importantly, we show that these models maintain strong predictive accuracy when applied to unstructured inputs, such as clinical notes, in both zero-shot and few-shot scenarios.
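To make the idea of a "structured representation" concrete, here is a minimal sketch of one way a tabular patient record might be turned into text that an LLM can be fine-tuned on. It is not the authors' pipeline: the field names, the `serialize_record` and `build_prompt` helpers, and the prompt wording are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: field names, helpers, and prompt wording are
# assumptions, not the serialization used in the paper.

def serialize_record(record: dict) -> str:
    """Render a structured patient record as 'name: value' lines."""
    lines = []
    for name, value in record.items():
        if value is None:
            continue  # skip missing measurements
        lines.append(f"{name.replace('_', ' ')}: {value}")
    return "\n".join(lines)


def build_prompt(record: dict) -> str:
    """Wrap the serialized record in a simple yes/no risk question."""
    return (
        "Patient record:\n"
        f"{serialize_record(record)}\n"
        "Question: Will this patient develop cardiovascular disease? "
        "Answer yes or no."
    )


# Hypothetical example record with one missing field.
example = {
    "age": 57,
    "sex": "female",
    "systolic_blood_pressure": 142,
    "smoker": None,
}
prompt = build_prompt(example)
```

At inference time, the serialized block could in principle be replaced by a free-text clinical note while the question stays the same, which is the kind of structured-to-unstructured shift the abstract describes.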

Submission Number: 84
