Abstract: Domain adaptation of Large Language Models (LLMs) produces models better suited to a particular domain by capturing patterns from domain text, which leads to improvements in downstream tasks. These improvements are visible to the naked eye; the patterns behind them are not. How can we know which patterns contribute to changes in downstream scores, and by how much? Through a Multilevel Analysis, we discover and quantify the effect of text patterns on the downstream scores of domain-adapted Llama 2 for the task of sentence similarity (BIOSSES dataset). We show that text patterns from PubMed abstracts, such as clear writing and simplicity, as well as the amount of biomedical information, are key to improving downstream scores. We also show that another factor, not usually quantified, contributes equally to downstream scores: the choice of hyperparameters for both domain adaptation and fine-tuning.