Tables2Traces: Distilling Tabular Data to Improve LLM Reasoning in Healthcare

ICLR 2026 Conference Submission25257 Authors

20 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0

Keywords: large language models, tabular data, healthcare, medicine

TL;DR: We convert tabular clinical data into reasoning traces that improve LLM medical question answering across domains.

Abstract: Large language models (LLMs) excel at reasoning when fine-tuned on curated text corpora, but many domains, such as medicine, primarily store knowledge in structured tabular data. Despite its richness, tabular data has been largely overlooked as a source of reasoning supervision. Interpreting such data requires structured, relational reasoning across features and outcomes, not just surface-level pattern matching. In practice, this mirrors clinical decision making, where doctors often compare patients with similar characteristics and reason about why their outcomes diverge. We introduce Tables2Traces, the first framework to enable improved reasoning from raw tabular data by generating contrastive, case-based reasoning traces for model fine-tuning. This establishes a new supervision paradigm: converting tabular records, traditionally used only for prediction, into structured reasoning signals that can serve as an effective new source of supervision for LLMs. Crucially, this paradigm is orthogonal to text-based QA supervision: rather than competing with curated corpora, it unlocks an abundant and low-cost modality that complements existing approaches. Using only cardiovascular patient records, Tables2Traces yields relative gains of 17.2% on in-domain MedQA questions and 8.4% out-of-domain, improving accuracy in 15 of 17 clinical categories. On MedMCQA, it achieves a 7.2% relative improvement and outperforms the base model in 17 of 21 specialties. These gains are driven by a lightweight, domain-agnostic pipeline that elicits structured reasoning via contrastive and counterfactual prompts. Compared to training on narrative patient descriptions, Tables2Traces generalizes more effectively across question types and medical specialties, showing that even limited tabular data can serve as a scalable and complementary source of reasoning supervision for LLMs.
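The abstract does not specify how contrastive cases are selected or how prompts are worded, so the following is only a minimal sketch of the general idea: pair each patient with the most similar patient whose outcome differs, then ask a model to explain the divergence. The toy records, the feature names, the range-normalized L1 distance, and the prompt wording are all illustrative assumptions, not the authors' pipeline.

```python
from typing import Dict, List, Optional

Record = Dict[str, float]

# Hypothetical toy records; feature names and values are illustrative only.
PATIENTS: List[Record] = [
    {"age": 54, "systolic_bp": 130, "cholesterol": 210, "outcome": 0},
    {"age": 56, "systolic_bp": 135, "cholesterol": 260, "outcome": 1},
    {"age": 71, "systolic_bp": 160, "cholesterol": 240, "outcome": 1},
    {"age": 35, "systolic_bp": 115, "cholesterol": 180, "outcome": 0},
]
FEATURES = ["age", "systolic_bp", "cholesterol"]


def feature_ranges(records: List[Record], features: List[str]) -> Dict[str, float]:
    """Per-feature value range, used to put features on a comparable scale."""
    ranges = {}
    for f in features:
        values = [r[f] for r in records]
        span = max(values) - min(values)
        ranges[f] = span if span > 0 else 1.0  # avoid division by zero
    return ranges


def distance(a: Record, b: Record, ranges: Dict[str, float]) -> float:
    """Range-normalized L1 distance (an assumed similarity measure)."""
    return sum(abs(a[f] - b[f]) / ranges[f] for f in ranges)


def nearest_contrast(index: int, records: List[Record],
                     ranges: Dict[str, float]) -> Optional[int]:
    """Index of the most similar record with a different outcome, if any."""
    anchor = records[index]
    candidates = [
        (distance(anchor, r, ranges), i)
        for i, r in enumerate(records)
        if i != index and r["outcome"] != anchor["outcome"]
    ]
    return min(candidates)[1] if candidates else None


def contrastive_prompt(a: Record, b: Record, features: List[str]) -> str:
    """Assumed prompt template asking why two similar cases diverge."""
    def describe(r: Record) -> str:
        return ", ".join(f"{f}={r[f]}" for f in features)

    return (
        f"Patient A ({describe(a)}) had outcome {a['outcome']}.\n"
        f"Patient B ({describe(b)}) had outcome {b['outcome']}.\n"
        "These patients are similar. Explain step by step which differences "
        "most plausibly account for the different outcomes, and what change "
        "to Patient A's features would be expected to alter the outcome."
    )


if __name__ == "__main__":
    ranges = feature_ranges(PATIENTS, FEATURES)
    for i, patient in enumerate(PATIENTS):
        j = nearest_contrast(i, PATIENTS, ranges)
        if j is not None:
            print(contrastive_prompt(patient, PATIENTS[j], FEATURES), end="\n\n")
```

In a setup like this, each prompt would be sent to a teacher model, and the returned explanation would be stored as a reasoning trace for fine-tuning. The last sentence of the template stands in for the counterfactual prompting the abstract mentions.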

Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)

Submission Number: 25257
