Extrapolating Large Models from the Small: Optimal Learning of Scaling Laws

ICLR 2026 Conference Submission10071 Authors

18 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: LLM, OOD, Scaling Laws, Performance Prediction, reliable evaluation
Abstract: Evaluating large language models (LLMs) is increasingly critical yet prohibitively expensive as models scale to billions of parameters. Predictive evaluation via scaling laws has emerged as a cost-effective alternative, which extrapolates large-model performance from smaller ones, often by fitting power-law relationships. However, existing approaches lack formal guarantees on the predicted results and overlook the out-of-distribution nature of such extrapolation, leading to high instability. We address these challenges with three key contributions. First, we introduce Equivalent Sample Size (ESS), a natural and principled metric that quantifies prediction uncertainty by translating it into the number of test samples required for direct, in-distribution evaluation. Second, we analyze how extrapolation amplifies prediction variance and develop an efficient algorithm that optimally allocates smaller-model evaluations to maximize ESS under compute budgets. Third, experiments on both simulated and real datasets show that ESS and our algorithm guide the design of scaling-law learning, cut evaluation cost, and deliver reliable LLM performance predictions.
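A minimal, hypothetical sketch of the power-law extrapolation the abstract refers to: fit a scaling curve on small-model evaluations, extrapolate to a larger model, and propagate fit uncertainty to the prediction. The data, parametrization, and delta-method uncertainty proxy below are illustrative assumptions, not the paper's ESS metric or allocation algorithm.

```python
# Illustrative sketch (assumed data and parametrization), not the paper's method.
import numpy as np

# Made-up small-model evaluations: (parameter count, benchmark accuracy).
sizes = np.array([1e8, 3e8, 1e9, 3e9])
scores = np.array([0.42, 0.51, 0.58, 0.63])

# Model the error rate as a power law in model size:
#   1 - accuracy ~ A * N^(-b)   <=>   log(1 - acc) = log A - b * log N
log_n = np.log(sizes)
log_err = np.log(1.0 - scores)
coeffs, cov = np.polyfit(log_n, log_err, deg=1, cov=True)

# Extrapolate to a 70B-parameter model (far outside the fitted size range).
target = np.log(7e10)
x = np.array([target, 1.0])            # [log N, intercept] design vector
pred_log_err = x @ coeffs
pred_acc = 1.0 - np.exp(pred_log_err)

# Delta-method variance of the prediction: extrapolation amplifies coefficient
# uncertainty because the design vector sits far from the fitted range.
pred_var = x @ cov @ x
print(f"predicted accuracy at 70B params: {pred_acc:.3f} "
      f"(log-error std ~ {np.sqrt(pred_var):.3f})")
```

The growth of this propagated variance with extrapolation distance is the kind of instability the abstract describes; the paper's contribution is to quantify it via ESS and to allocate small-model evaluations so as to control it under a compute budget.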
Primary Area: foundation or frontier models, including LLMs
Submission Number: 10071