The Chicken and Egg Dilemma: Co-optimizing Data and Model Configurations for LLMs

18 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Co-optimization, Joint Optimization of LLM data and model configurations
TL;DR: We introduce JoBS, an efficient algorithm to co-optimize data and model configurations for LLMs - an endeavor once thought to be intractable.
Abstract: Co-optimizing data and model configurations for LLMs presents a classic chicken-and-egg dilemma: the best training data configuration (e.g., training data composition) depends on the chosen model configuration (e.g., model architecture, fine-tuning configuration), but the best model configuration in turn depends on the chosen training data. Jointly optimizing both has long been considered intractable, and existing methods select either data or model configurations in isolation, ignoring their complex interdependence. We introduce JoBS, an efficient method that jointly optimizes LLM training data and model configurations by framing the problem as black-box optimization. Central to our method is a novel performance scaling law predictor, which learns a diverse family of performance scaling laws across configurations and cheaply predicts how promising a given training configuration is. This lets us quickly build an approximate LLM performance landscape and efficiently locate optimal training configurations with Bayesian Optimization (BO). JoBS not only outperforms existing baselines across diverse tasks in the fine-tuning setting, but also runs up to 12.4× faster. We hope our work draws more attention to the chicken-and-egg dilemma inherent in co-optimizing training configurations for LLMs. Our anonymized code is available at https://github.com/a35453779/JoBS.
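To make the high-level recipe in the abstract concrete, below is a minimal, hypothetical sketch of the kind of loop it describes: fit a per-configuration scaling law on cheap pilot runs, extrapolate each joint (data, model) configuration's final performance, and drive the search with a Bayesian Optimization surrogate. All names here (`encode`, `evaluate_pilot`, `candidates`, the power-law form) are illustrative assumptions, not the authors' actual JoBS implementation or API.

```python
# Hypothetical sketch of a JoBS-style joint data/model configuration search.
# Assumes: a finite candidate pool of joint configurations, a featurizer
# `encode(cfg)`, and `evaluate_pilot(cfg)` returning (token_counts, losses)
# from a few short training runs. None of these come from the paper.
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern


def scaling_law(tokens, a, b, c):
    """Simple power-law form: predicted loss after training on `tokens` tokens."""
    return a * tokens ** (-b) + c


def predict_full_scale_loss(pilot_tokens, pilot_losses, full_tokens):
    """Fit a scaling law to cheap pilot runs of one configuration and
    extrapolate its loss at the full training budget."""
    (a, b, c), _ = curve_fit(scaling_law, pilot_tokens, pilot_losses,
                             p0=[1.0, 0.1, 0.5], maxfev=10_000)
    return scaling_law(full_tokens, a, b, c)


def expected_improvement(mu, sigma, best):
    """EI acquisition for loss minimization (lower loss is better)."""
    sigma = np.maximum(sigma, 1e-9)
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)


def joint_bo_search(encode, evaluate_pilot, candidates, full_tokens,
                    n_init=5, n_iter=20, seed=0):
    """Black-box search over joint (data mixture, model config) candidates."""
    rng = np.random.default_rng(seed)
    X, y = [], []

    # Warm-start the surrogate with a few randomly chosen joint configurations.
    for i in rng.permutation(len(candidates))[:n_init]:
        toks, losses = evaluate_pilot(candidates[i])
        X.append(encode(candidates[i]))
        y.append(predict_full_scale_loss(toks, losses, full_tokens))

    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    feats = np.array([encode(c) for c in candidates])

    for _ in range(n_iter):
        gp.fit(np.array(X), np.array(y))
        mu, sigma = gp.predict(feats, return_std=True)
        cfg = candidates[int(np.argmax(expected_improvement(mu, sigma, min(y))))]
        toks, losses = evaluate_pilot(cfg)
        X.append(encode(cfg))
        y.append(predict_full_scale_loss(toks, losses, full_tokens))

    # Return the candidate with the best predicted full-scale performance.
    gp.fit(np.array(X), np.array(y))
    return candidates[int(np.argmin(gp.predict(feats)))]
```

The key design choice illustrated here is that the expensive inner objective (full-scale training) is never run during the search: only short pilot runs feed the scaling-law extrapolation, and the BO surrogate reuses those cheap predictions to approximate the performance landscape over joint configurations.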
Primary Area: foundation or frontier models, including LLMs
Submission Number: 10060