Best of Both Worlds: Combining General and Clinical Language Models for Classification and Text Generation
Keywords: Efficient Domain Adaptation, Clinical Adaptation, Language Models
TL;DR: Efficient, training-free clinical domain adaptation of frontier large language models.
Track: Findings
Abstract: We study proxy tuning, a training-free, decoding-time method that combines the strengths of general and clinical language models. Across three classification and four text generation tasks, zero-shot proxy tuning consistently improves performance over baselines, yielding an average 6.5\% Macro-F1 gain over a large general model on classification tasks and surpassing a 70B clinical model on all generative tasks. Our analysis reveals that proxy-tuning configurations that isolate the effect of clinical continued pretraining produce the largest gains on medical knowledge-intensive tasks. We additionally introduce Cross-Architecture Proxy Tuning (CAPT), which enables proxy tuning across models with different architectures and limited access to logit distributions. CAPT with a new-generation base model (Qwen3-30B) achieves performance comparable to supervised fine-tuning on 2,600 samples for classification tasks and produces 90\% clinically safe outputs on generation tasks. Our findings demonstrate that proxy tuning offers a practical, efficient path to clinical domain adaptation without model retraining.
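
For readers unfamiliar with the mechanism, proxy tuning steers a large base model at decoding time by adding the logit difference between a small tuned expert and its untuned anti-expert to the base model's logits. Below is a minimal sketch of one decoding step, assuming the standard formulation (Liu et al., 2024) and the Hugging Face transformers API; the model names, greedy decoding, and alpha = 1.0 are illustrative assumptions, not this paper's exact configuration.

    # Minimal sketch of proxy-tuned decoding (logit arithmetic per Liu et al., 2024).
    # Model names below are placeholders, not the submission's actual setup.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    BASE = "meta-llama/Llama-2-70b-hf"        # large general base model (assumption)
    EXPERT = "clinical-expert-7b"             # small clinical expert (hypothetical name)
    ANTI_EXPERT = "meta-llama/Llama-2-7b-hf"  # small general anti-expert (assumption)

    tok = AutoTokenizer.from_pretrained(BASE)
    base = AutoModelForCausalLM.from_pretrained(BASE)
    expert = AutoModelForCausalLM.from_pretrained(EXPERT)
    anti = AutoModelForCausalLM.from_pretrained(ANTI_EXPERT)

    @torch.no_grad()
    def proxy_tuned_generate(prompt, max_new_tokens=128, alpha=1.0):
        ids = tok(prompt, return_tensors="pt").input_ids
        for _ in range(max_new_tokens):
            # Next-token logits from each model; plain proxy tuning assumes all
            # three models share one vocabulary (the constraint CAPT relaxes).
            l_base = base(ids).logits[:, -1, :]
            l_expert = expert(ids).logits[:, -1, :]
            l_anti = anti(ids).logits[:, -1, :]
            # Shift the base logits by the expert / anti-expert contrast.
            l_tuned = l_base + alpha * (l_expert - l_anti)
            next_id = l_tuned.argmax(dim=-1, keepdim=True)  # greedy; no KV cache for brevity
            if tok.eos_token_id is not None and next_id.item() == tok.eos_token_id:
                break
            ids = torch.cat([ids, next_id], dim=-1)
        return tok.decode(ids[0], skip_special_tokens=True)

The CAPT variant is deliberately not sketched here: the abstract states only its goal (handling different architectures and limited logit access), not its mechanism, which the paper itself defines.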
General Area: Applications and Practice
Specific Subject Areas: Natural Language Processing
Supplementary Material: zip
Data And Code Availability: Yes
Ethics Board Approval: No
Entered Conflicts: I confirm the above
Anonymity: I confirm the above
Submission Number: 241