Adapting a language model (LM) to a specific domain, a.k.a. domain adaptation, is common practice when specialized knowledge, e.g., medicine, is not encapsulated in a general language model like Llama2. This typically involves a two-stage process of continued pre-training followed by supervised fine-tuning. Implementing a pipeline with these two stages not only introduces complexity (each stage requires its own meticulous tuning) but also causes two data distribution shifts, exacerbating catastrophic forgetting. To mitigate these issues, we propose a one-stage domain adaptation protocol in which heterogeneous data from both the traditional pre-training and supervised fine-tuning stages are unified into a simple instruction-output pair format, enabling efficient knowledge injection. A data priority sampling strategy is then introduced to adaptively adjust the data mixture during training. Following this protocol, we train HuatuoGPT-II, an LLM specialized for the Chinese medical domain. HuatuoGPT-II achieves performance competitive with GPT-4 across multiple benchmarks and, in particular, attains state-of-the-art (SOTA) results on several Chinese medical benchmarks and the newest pharmacist licensure examinations. Furthermore, we analyze the behavior of the one-stage protocol, and the experiments show that its simplicity improves training stability and domain generalization.
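To make the two ingredients of the protocol concrete, the following is a minimal Python sketch of (i) unifying heterogeneous data into instruction-output pairs and (ii) priority-based sampling of the data mixture. The function names, the instruction template, and the weighting rule are illustrative assumptions for exposition, not the exact implementation used to train HuatuoGPT-II.

```python
import random

def to_instruction_pair(record: str, source: str) -> dict:
    """Unify a raw corpus document or a supervised record into an
    instruction-output pair (hypothetical template for illustration)."""
    if source == "pretrain_corpus":
        # In practice an instruction could be generated per document;
        # a fixed template is used here to keep the sketch self-contained.
        return {"instruction": "Explain the following medical text.",
                "output": record}
    # Supervised data is assumed to arrive as "instruction<TAB>output".
    instruction, output = record.split("\t", 1)
    return {"instruction": instruction, "output": output}

def priority_sample(pools: dict, weights: dict, batch_size: int) -> list:
    """Draw one training batch, mixing sources in proportion to their
    current priority weights (weights would be adapted during training)."""
    sources = list(pools)
    probs = [weights[s] for s in sources]
    total = sum(probs)
    probs = [p / total for p in probs]
    batch = []
    for _ in range(batch_size):
        src = random.choices(sources, weights=probs, k=1)[0]
        batch.append(random.choice(pools[src]))
    return batch

# Example usage with toy data and arbitrary starting weights.
pools = {
    "pretrain_corpus": [to_instruction_pair("Aspirin inhibits COX enzymes.",
                                            "pretrain_corpus")],
    "sft": [to_instruction_pair("What does aspirin inhibit?\tCOX enzymes.",
                                "sft")],
}
weights = {"pretrain_corpus": 0.7, "sft": 0.3}
print(priority_sample(pools, weights, batch_size=4))
```

In this sketch the priority weights are static; the protocol described above would update them adaptively over the course of training to rebalance the mixture.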