Keywords: LLMs, EHR, SQL, Inference, Tabular Prediction
TL;DR: A framework that leverages large language models for zero-shot tabular prediction in electronic health records, achieving competitive predictive performance while maintaining privacy compliance.
Abstract: Electronic health records (EHRs) contain richly structured, longitudinal data essential for predictive modeling, yet stringent privacy regulations (e.g., HIPAA, GDPR) often restrict access to individual-level records. We introduce \textbf{Query, Don’t Train} (QDT): a \textbf{structured-data foundation-model interface} enabling \textbf{tabular inference} via LLM-generated SQL over EHRs. Instead of training on or accessing individual-level examples, QDT uses a large language model (LLM) as a schema-aware query planner to generate privacy-compliant SQL queries from a natural language task description and a test-time input. The model then extracts summary-level population statistics through these SQL queries, and the LLM performs chain-of-thought reasoning over the results to make predictions. This inference-time-only approach (1) eliminates the need for supervised model training or direct data access, (2) ensures interpretability through symbolic, auditable queries, (3) naturally handles missing features without imputation or preprocessing, and (4) handles high-dimensional numerical data effectively. We validate QDT on the task of 30-day hospital readmission prediction for Type 2 diabetes patients using a MIMIC-style EHR cohort, achieving F1 = 0.70, which outperforms TabPFN (F1 = 0.68). To our knowledge, this is the first demonstration of LLM-driven, privacy-preserving structured prediction using only schema metadata and aggregate statistics, offering a scalable, interpretable, and regulation-compliant alternative to conventional foundation-model pipelines.
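The abstract describes a two-step inference loop: the LLM plans aggregate-only SQL from schema metadata and a test-time input, then reasons over the returned statistics. Below is a minimal sketch of such a loop, assuming an illustrative schema, a hypothetical `run_llm` helper, and a SQLite connection; it is not the authors' implementation.

```python
# Sketch of a QDT-style inference loop (illustrative names and schema, not the paper's code).
import sqlite3

# Schema metadata shared with the LLM; no row-level data is ever exposed to it.
SCHEMA = "readmissions(patient_id, age, hba1c, num_prior_admissions, readmitted_30d)"

def run_llm(prompt: str) -> str:
    """Placeholder for an LLM call (plug in any chat-completion client here)."""
    raise NotImplementedError

def qdt_predict(conn: sqlite3.Connection, patient: dict) -> str:
    # Step 1: use the LLM as a schema-aware query planner that emits aggregate-only SQL.
    plan_prompt = (
        f"Schema: {SCHEMA}\n"
        f"Task: predict 30-day readmission for a Type 2 diabetes patient.\n"
        f"Test-time input: {patient}\n"
        "Write SQL returning only summary statistics (COUNT/AVG), never individual rows."
    )
    sql = run_llm(plan_prompt)

    # Step 2: execute the privacy-compliant aggregate query against the EHR database.
    stats = conn.execute(sql).fetchall()

    # Step 3: chain-of-thought reasoning over the aggregate statistics to make a prediction.
    reason_prompt = (
        f"Population statistics: {stats}\n"
        f"Patient features: {patient}\n"
        "Reason step by step, then answer 'readmitted' or 'not readmitted'."
    )
    return run_llm(reason_prompt)
```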
Submission Number: 85