Unlocking Latent Medical Reasoning in LLMs via Inference-Time Representation and Prefix Interventions
Keywords: Medical Reasoning, Representation Engineering, Prefix Tuning, Large Language Models, Data Efficiency
Abstract: Recent reasoning advances in large language models (LLMs) have broadened their applicability to medical tasks.
Yet most prior work remains dependent on scarce, high-quality rationales and compute-intensive post-training, with limited exploration of how to leverage the medical capabilities acquired during pretraining.
Consequently, a key challenge is how to elicit these latent capabilities in a data-efficient manner.
To address this gap, we introduce RIPT, a lightweight framework for data-efficient capability activation.
RIPT explicitly decomposes its objective into two complementary components: reasoning enhancement and medical knowledge elicitation.
For the former, we extract steering vectors from hidden activations on a small set of high-quality paired reasoning/direct responses to shape LLMs' reasoning behavior.
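The submission does not include code; as an illustrative sketch, a steering vector of this kind is commonly built as the difference in mean hidden activations between the two response styles (function name, shapes, and the unit-normalisation step are assumptions, not details from the paper):

```python
import numpy as np

def steering_vector(reasoning_acts, direct_acts):
    """Difference-in-means steering vector from paired hidden activations.

    reasoning_acts, direct_acts: arrays of shape (n_pairs, hidden_dim),
    hidden states collected at one layer for reasoning-style vs. direct
    responses to the same prompts.
    """
    v = reasoning_acts.mean(axis=0) - direct_acts.mean(axis=0)
    # Unit-normalise so the injection strength can be tuned separately.
    return v / np.linalg.norm(v)
```

In this construction, adding a scaled copy of the vector to a layer's hidden states at inference nudges generations toward the reasoning style.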
For the latter, we obtain prefix vectors via prefix tuning on simple medical QA pairs to elicit domain-specific knowledge.
At inference, we freeze the backbone LLM and apply a hybrid intervention that jointly injects both steering and prefix vectors.
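A simplified sketch of such a hybrid intervention at one frozen layer, with the prefix vectors prepended to the layer's input sequence and the steering vector added to every position (in practice prefix tuning usually prepends to attention keys/values; all names, shapes, and the scalar `alpha` here are illustrative assumptions):

```python
import numpy as np

def hybrid_intervention(hidden, steer_vec, prefix, alpha=1.0):
    """Inference-time intervention on one frozen layer's inputs.

    hidden:    (seq_len, d) hidden states at the chosen layer
    steer_vec: (d,) steering vector shifting the reasoning direction
    prefix:    (p, d) learned prefix vectors eliciting domain knowledge
    Returns a (p + seq_len, d) array: prefix prepended, steering added.
    """
    steered = hidden + alpha * steer_vec  # broadcast over positions
    return np.concatenate([prefix, steered], axis=0)
```

The backbone weights are never updated; only the small steering and prefix vectors are stored and injected.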
Experiments under limited-resource settings show that RIPT consistently outperforms strong baselines, suggesting an efficient pathway for unlocking LLMs’ medical reasoning capabilities.
Paper Type: Long
Research Area: Clinical and Biomedical Applications
Research Area Keywords: representation learning, parameter-efficient-training, data-efficient training, healthcare applications, clinical NLP, biomedical QA, reasoning
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Approaches to low-resource settings, Approaches to low-compute settings / efficiency
Languages Studied: English
Submission Number: 2218