Boosting Multi-Domain Fine-Tuning of Large Language Models through Evolving Interactions between Samples

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
Abstract: Multi-domain fine-tuning of large language models (LLMs) confronts a notorious trade-off among abilities across domains. Existing studies attribute this trade-off to conflicts between samples rooted in their inherent semantics. Recent approaches attempt to mitigate these conflicts through empirical investigation or heuristic strategies. However, without a fundamental understanding of the interactions between samples, they yield only marginal improvements while incurring substantial trial-and-error costs. To address this challenge, we move beyond empirical studies by modeling interactions between samples as their influence on each other's loss, estimated using gradients. Intriguingly, we find that these interactions **evolve throughout training** rather than being determined purely by inherent semantics. Building on this insight, we propose **EV**olving **I**nteraction-guided **C**urriculum (**EVIC**), which iteratively selects samples that positively influence the overall dataset for training. By dynamically adapting the training curriculum to prioritize the samples that contribute most to model training, EVIC effectively mitigates conflicts and improves sample efficiency. Extensive experiments on a mixed dataset covering coding, math, and general tasks, with several model architectures, show that EVIC significantly outperforms all baselines across diverse capabilities.
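The abstract's core mechanism, estimating how one sample's gradient update affects another sample's loss and selecting samples with positive net influence, can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: it assumes a least-squares model and the standard first-order approximation in which a gradient step on sample i changes sample j's loss by roughly -lr * (g_i · g_j). The function names (`per_sample_grads`, `select_positive`) are hypothetical.

```python
import numpy as np

def per_sample_grads(w, X, y):
    """Per-sample gradients for squared-error loss: g_i = (x_i . w - y_i) * x_i."""
    resid = X @ w - y            # shape (n,)
    return resid[:, None] * X    # shape (n, d)

def influence_matrix(G):
    """First-order influence estimate: a step on sample i changes sample j's
    loss by about -lr * (g_i . g_j), so g_i . g_j > 0 means i helps j."""
    return G @ G.T               # shape (n, n)

def select_positive(w, X, y):
    """Indices of samples whose total influence on the whole dataset is positive
    at the current parameters w (the curriculum is re-derived as w evolves)."""
    G = per_sample_grads(w, X, y)
    total_influence = influence_matrix(G).sum(axis=1)  # shape (n,)
    return np.flatnonzero(total_influence > 0)
```

Because the influence matrix depends on the current parameters `w`, re-running `select_positive` between training rounds yields a curriculum that adapts as interactions evolve, which is the evolving-interaction idea the abstract describes, here reduced to its simplest form.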
Lay Summary: Training large language models (LLMs) on multiple domains, such as math, coding, and general language, often causes a trade-off: improving in one domain can degrade performance in another. This challenge is commonly blamed on inherent semantic conflicts between training samples from different domains. Prior methods have tried to resolve this through empirical tuning or heuristics, but these strategies offer limited gains and require extensive trial and error. Our research introduces a new perspective by analyzing how training samples influence each other using gradient-based measurements. Surprisingly, we find that sample conflicts are not fixed by content alone: they change as training progresses. Based on this insight, we propose EVIC, a method that dynamically adjusts the training process by selecting the samples that are most helpful to overall model performance. This curriculum-style approach reduces harmful interference between domains and helps the model learn more effectively. Our experiments on mixed-domain datasets show that EVIC improves performance across all tasks and domains while requiring fewer training steps, making it a promising step toward more balanced and efficient multi-domain LLM training.
Primary Area: Deep Learning->Large Language Models
Keywords: curriculum learning, large language models, mixed data
Submission Number: 11580