Keywords: Large Language Models, Strategy Selection, Prediction, Model Adaptation, Routing, Scaling Law
TL;DR: We show that performance–cost tradeoffs are predictable, which enables cost-effective model–strategy selection.
Abstract: Large language models (LLMs) achieve excellent performance across numerous tasks by using a diverse array of adaptation strategies. However, selecting the optimal combination of model, strategy, and configuration under resource constraints is challenging and typically requires extensive experimentation. We ask whether it is possible to accurately predict both downstream performance and cost without running expensive adaptation trials. We introduce COSMOS, a general analysis framework for approaching this strategy selection problem through low-cost prediction of adaptation outcomes. We instantiate our framework via a pair of powerful predictors: embedding-augmented lightweight proxy models to predict fine-tuning performance, and low-sample scaling laws to forecast retrieval-augmented in-context learning. Evaluations across eight representative benchmarks demonstrate that COSMOS instantiations achieve high prediction accuracy while reducing computational costs by 92.72% on average, and up to 98.71% in resource-intensive scenarios. Our results show that efficient prediction of adaptation outcomes is possible, enabling practitioners to navigate large model–strategy–configuration spaces efficiently and substantially reduce deployment cost while maintaining strong task performance.
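To illustrate the flavor of the low-sample scaling-law predictor mentioned in the abstract, here is a minimal sketch (not the paper's actual method; the data and the power-law form are illustrative assumptions): a handful of cheap pilot runs at small in-context example counts are fit to a power law, which is then extrapolated to forecast performance at a larger count without running that trial.

```python
import numpy as np

# Hypothetical illustration: fit a power-law scaling curve from a few
# cheap low-sample runs, then extrapolate to a larger in-context
# example count. All numbers below are synthetic.
n_examples = np.array([1, 2, 4, 8])            # small pilot budgets
error = np.array([0.40, 0.30, 0.225, 0.169])   # observed task error (synthetic)

# Assume error ~ b * n**slope (slope < 0); linear fit in log-log space.
slope, log_b = np.polyfit(np.log(n_examples), np.log(error), 1)
b = np.exp(log_b)

def predict_error(n: int) -> float:
    """Forecast task error at n in-context examples from the fitted law."""
    return b * n ** slope

# Forecast error at 32 examples without running that (expensive) trial.
forecast = predict_error(32)
```

The point of the sketch is the cost profile: only the small pilot runs are executed, and the fitted curve stands in for the expensive full-scale trials when comparing configurations.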
Primary Area: foundation or frontier models, including LLMs
Submission Number: 10108