Test-Time Scaling in Clinical Decision Making

Published: 14 Feb 2026, Last Modified: 16 Apr 2026MIDL 2026 PosterEveryoneRevisionsBibTeXCC BY 4.0
Keywords: Vision Language Model, Test-Time Scaling, Reasoning, Medical Imaging and Diagnosis
Abstract: Large language models (LLMs) have demonstrated remarkable capabilities in complex reasoning and knowledge-intensive tasks, yet their potential for clinical decision making through test-time scaling (TTS) remains largely unexplored. While TTS has shown promise in improving reasoning performance by leveraging additional inference-time computation, its effectiveness in the medical domain has not been systematically investigated. This gap is further exacerbated by the impracticality of supervised fine-tuning for clinical reasoning tasks, owing to limited data availability and high annotation costs. In this work, we present a comprehensive study of TTS for clinical decision making. We systematically investigate the interaction between TTS and inference strategies, including direct answering, chain-of-thought prompting, and two-stage reasoning. We generate multiple candidate outputs in parallel using large reasoning models and aggregate them via self-consistency decoding. This approach does not need any supervision while it leverages additional inference-time computation to improve the performance. We provide comprehensive empirical evaluation across both text-based medical question answering benchmarks and medical imaging modalities, demonstrating consistent improvements over single-pass inference baselines with performance gains of up to 30 percentage points. Finally, we provide an analytical characterization of TTS, deriving scaling laws that describe how performance improves with the number of samples and identifying conditions under which TTS yields reliable gains, along with empirical validation on diverse medical decision-making tasks.
Primary Subject Area: Generative Models
Secondary Subject Area: Foundation Models
Registration Requirement: Yes
Visa & Travel: No
Read CFP & Author Instructions: Yes
Originality Policy: Yes
Single-blind & Not Under Review Elsewhere: Yes
LLM Policy: Yes
Midl Latex Submission Checklist: Ensure no LaTeX errors during compilation., Replace NNN with your OpenReview submission ID., Includes \documentclass{midl}, \jmlryear{2026}, \jmlrworkshop, \jmlrvolume, \editors, and correct \bibliography command., Did not override options of the hyperref package., Did not use the times package., Use the correct spelling and format, avoid Unicode characters, and use LaTeX equivalents instead., Any math in the title and abstract must be enclosed within $...$., Did not override the bibliography style defined in midl.cls and did not use \begin{thebibliography} directly to insert references., Avoid using \scalebox; use \resizebox when needed., Included all necessary figures and removed *unused* files in the zip archive., Removed special formatting, visual annotations, and highlights used during rebuttal., All special characters in the paper and .bib file use LaTeX commands (e.g., \'e for é)., No separate supplementary PDF uploads., Acknowledgements, references, and appendix must start after the main content.
Latex Code: zip
Copyright Form: pdf
Submission Number: 175
Loading