Conformalized Scaling Laws: Distribution-Free Prediction Intervals for Out-of-Distribution Compute Regimes
Keywords: Conformal Prediction, Scaling Laws, Uncertainty Quantification
TL;DR: CSL replaces flawed OLS intervals with distribution-free conformal prediction to provide valid, honest uncertainty for LLM scaling extrapolation.
Abstract: Neural scaling laws (Kaplan et al., 2020; Hoffmann et al., 2022) are routinely used to predict the loss of language models at compute budgets beyond those of existing runs, guiding decisions that cost hundreds of millions of dollars. Yet the confidence intervals accompanying these predictions are derived under parametric assumptions, such as Gaussian residuals and a correctly specified functional form, that are systematically violated during extrapolation. We show that standard ordinary least squares (OLS) confidence intervals undercover at out-of-distribution compute scales. In a controlled simulation on Pythia-like scaling data (Biderman et al., 2023), OLS 95% intervals achieve only 61% joint empirical coverage at held-out scales beyond the calibration range. We propose CSL (Conformalized Scaling Laws), which wraps any fitted scaling law with a split conformal prediction step using relative (log-scale) residuals as the nonconformity score. We prove that CSL achieves valid $(1 - \alpha)$ marginal coverage for any pre-trained scaling law without distributional assumptions. We further establish that the standard OLS interval systematically undercovers whenever the extrapolation distance exceeds the training residual scale, providing a quantitative condition for when practitioners must abandon parametric intervals. On synthetic Pythia-scale data, CSL with $\alpha = 0.10$ achieves 89.7% empirical joint coverage (target: 90%), while the OLS 95% interval achieves only 61.3%; the CSL intervals are wider, more honest, and correctly calibrated.
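The split conformal step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`conformal_quantile`, `csl_interval`), the power-law form, and the synthetic data are all assumptions made for illustration. The score is the absolute log-scale residual $|\log y - \log \hat{y}|$, and the interval is obtained by multiplying the point prediction by $e^{\pm q}$, where $q$ is the finite-sample conformal quantile of the calibration scores. The standard split conformal guarantee additionally assumes that calibration and test points are exchangeable.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Finite-sample split-conformal quantile: the k-th smallest score,
    with k = ceil((n + 1) * (1 - alpha)). Returns inf if k exceeds n."""
    scores = np.sort(np.asarray(scores, dtype=float))
    n = scores.size
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return np.inf if k > n else scores[k - 1]

def csl_interval(predict_loss, cal_compute, cal_loss, new_compute, alpha=0.10):
    """Wrap an already-fitted scaling law (predict_loss, hypothetical name)
    with a split conformal step using log-scale residuals as scores."""
    cal_pred = predict_loss(np.asarray(cal_compute, dtype=float))
    # Nonconformity score: absolute residual in log space (relative error).
    scores = np.abs(np.log(np.asarray(cal_loss, dtype=float)) - np.log(cal_pred))
    q = conformal_quantile(scores, alpha)
    pred = predict_loss(np.asarray(new_compute, dtype=float))
    # |log y - log pred| <= q  is equivalent to  pred*e^-q <= y <= pred*e^q.
    return pred * np.exp(-q), pred * np.exp(q)

# Illustrative usage on synthetic data (all values are assumptions).
rng = np.random.default_rng(0)
law = lambda c: 2.0 * c ** -0.05            # hypothetical fitted power law
cal_c = np.logspace(17, 21, 50)             # calibration compute budgets
cal_y = law(cal_c) * np.exp(rng.normal(0.0, 0.02, cal_c.size))
lo, hi = csl_interval(law, cal_c, cal_y, np.array([1e22, 1e23]), alpha=0.10)
print(lo, hi)
```

Because the score is multiplicative, the interval width scales with the predicted loss rather than staying constant in absolute terms, which matches the "relative residual" choice named in the abstract.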
Paper Type: Short (4 pages)
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 58