STAF-LLM: A scalable and task-adaptive fine-tuning framework for large language models in the medical domain
Abstract: Recent large language models (LLMs) have demonstrated remarkable performance across various NLP tasks.
However, their application in the medical domain is often limited by a lack of specialized medical knowledge,
which is crucial for practical clinical tasks. In this work, we propose STAF-LLM, a scalable and task-adaptive fine-tuning framework designed to customize general-purpose LLMs for diverse downstream medical
applications. STAF-LLM consists of two stages: expert model training and task adaptation. In the first stage,
we design 12 core medical tasks and use AdaLoRA to train 12 expert models on these tasks with a unified
instruction format, transferring the learned domain-specific knowledge to the general-purpose LLM. In the
second stage, a task-guided router is trained for each downstream application to adaptively combine the expert
knowledge with the LLM, dynamically selecting the most relevant knowledge for inference. Experiments on
9 medical tasks, including 3 unseen ones, show that STAF-LLM outperforms Llama 2 by 10%–30%. Notably,
STAF-LLM achieves state-of-the-art performance on benchmark tasks like ICD coding.
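The abstract only outlines the second stage at a high level. As a rough illustration of the idea of a task-guided router adaptively weighting frozen low-rank expert updates on top of a frozen base model, the sketch below shows a minimal PyTorch version. It is a hypothetical toy, not the paper's implementation: the class names (LoRAExpert, TaskGuidedRouter, RoutedLinear), the per-token softmax gating, and the plain LoRA parameterization (rather than AdaLoRA's adaptive-rank variant) are all assumptions made for clarity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LoRAExpert(nn.Module):
    """One low-rank expert: delta_W = B @ A, applied on top of a frozen base weight."""
    def __init__(self, d_in: int, d_out: int, rank: int = 8):
        super().__init__()
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, d_in) -> (batch, d_out)
        return x @ self.A.T @ self.B.T


class TaskGuidedRouter(nn.Module):
    """Softmax gate over experts; only this gate would be trained per downstream task."""
    def __init__(self, d_in: int, n_experts: int):
        super().__init__()
        self.gate = nn.Linear(d_in, n_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Routing weights per input: (batch, n_experts)
        return F.softmax(self.gate(x), dim=-1)


class RoutedLinear(nn.Module):
    """Frozen base linear layer plus a router-weighted sum of expert deltas."""
    def __init__(self, base: nn.Linear, experts: list, router: TaskGuidedRouter):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the general-purpose LLM weights stay frozen
        self.experts = nn.ModuleList(experts)
        self.router = router

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = self.router(x)                                # (batch, n_experts)
        deltas = torch.stack([e(x) for e in self.experts], 1)   # (batch, n_experts, d_out)
        mixed = (weights.unsqueeze(-1) * deltas).sum(dim=1)     # weighted expert update
        return self.base(x) + mixed


# Toy usage: 4 experts over a 16 -> 16 projection (dimensions are arbitrary).
base = nn.Linear(16, 16)
experts = [LoRAExpert(16, 16, rank=4) for _ in range(4)]
router = TaskGuidedRouter(16, n_experts=4)
layer = RoutedLinear(base, experts, router)
print(layer(torch.randn(2, 16)).shape)  # torch.Size([2, 16])
```

In this reading, stage one would train each expert's low-rank matrices on its medical task, while stage two freezes the experts and fits only the lightweight gate for a new downstream application; how STAF-LLM actually parameterizes and trains the router is specified in the paper itself.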