Abstract: Large Language Models (LLMs) have achieved impressive performance across multiple domains through Supervised Fine-tuning (SFT). However, training on multiple capabilities simultaneously usually yields suboptimal performance compared to single-capability specialization, suggesting inherent conflicts between data sources during multi-objective SFT that restrict the model's ability to harmonize diverse skills effectively.
To address this challenge, in this paper we propose IDEAL, an Influence-based Data Equilibrium Adaptation framework that optimizes the mixture proportions of distinct SFT datasets based on their task-specific performance. IDEAL employs influence functions to iteratively refine the data allocation strategy, prioritizing datasets that enhance target capabilities while mitigating inter-domain conflicts.
Experiments across different capabilities demonstrate that IDEAL significantly outperforms conventional uniform data allocation strategies, achieving balanced improvements across diverse tasks without compromising individual capabilities. Our work highlights the critical role of data composition in multi-capability SFT and provides a scalable solution for training generalist LLMs through data ratio adaptation.
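The iterative reallocation idea described in the abstract can be sketched as follows. This is an illustrative toy, not the authors' implementation: the multiplicative-weights update rule, the step size, and the influence scores are all assumptions introduced here for illustration; in the paper, influence would be estimated from model gradients on each dataset with respect to target-task validation loss.

```python
import numpy as np

def update_mixture_weights(influence_scores, weights, step_size=1.0):
    """Illustrative reweighting step: datasets whose (hypothetical)
    influence on the target validation loss is more negative -- i.e.,
    more helpful -- receive a larger share of the SFT data mixture."""
    # Upweight helpful data (negative influence), downweight harmful data.
    new_w = weights * np.exp(-step_size * np.asarray(influence_scores))
    # Renormalize so the proportions form a valid mixture.
    return new_w / new_w.sum()

# Toy example: three SFT datasets, starting from a uniform allocation.
weights = np.ones(3) / 3
# Hypothetical influence of each dataset on the target-task loss
# (negative = training on it lowers the loss, so it helps).
influence = np.array([-0.5, 0.1, 0.4])
for _ in range(5):  # iteratively refine the allocation
    weights = update_mixture_weights(influence, weights)
```

After a few iterations the helpful dataset dominates the mixture while the conflicting one is suppressed, capturing the equilibrium-seeking behavior the abstract describes.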
Paper Type: Short
Research Area: Machine Learning for NLP
Research Area Keywords: NLP Applications
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data analysis
Languages Studied: English
Submission Number: 3210