TL;DR: We propose a novel Bayesian method for neural scaling law extrapolation, based on Prior-data Fitted Networks.
Abstract: Scaling has been a major driver of recent advances in deep learning. Numerous empirical studies have found that neural scaling laws often follow a power law, and several variants of power-law functions have been proposed to predict scaling behavior at larger scales. However, existing methods mostly rely on point estimation and do not quantify uncertainty, which is crucial for real-world decision-making problems such as determining the expected performance gains from investing additional computational resources. In this work, we explore a Bayesian framework based on Prior-data Fitted Networks (PFNs) for neural scaling law extrapolation. Specifically, we design a prior distribution from which infinitely many synthetic functions resembling real-world neural scaling laws can be sampled, allowing our PFN to meta-learn the extrapolation. We validate the effectiveness of our approach on real-world neural scaling laws, comparing it against both existing point-estimation methods and Bayesian approaches. Our method demonstrates superior performance, particularly in data-limited scenarios such as Bayesian active learning, underscoring its potential for reliable, uncertainty-aware extrapolation in practical applications.
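To make the recipe concrete, the prior-sampling step described in the abstract (draw synthetic scaling-law-like curves, then meta-train a PFN to extrapolate them) can be sketched in a few lines of Python. This is a minimal, hypothetical illustration only: the saturating power-law form, the parameter ranges, and the name sample_synthetic_scaling_curve are assumptions for exposition, not the authors' actual prior (see the linked repository for the real implementation).

```python
import numpy as np

def sample_synthetic_scaling_curve(rng, n_points=64):
    """Sample one synthetic scaling-law-like curve from a toy prior.

    Hypothetical saturating power-law form (an assumption, not the
    paper's actual prior):
        y(x) = eps_inf + a * x**(-b) + noise
    """
    eps_inf = rng.uniform(0.0, 0.5)          # irreducible error (asymptote)
    a = rng.lognormal(mean=0.0, sigma=1.0)   # scale of the power-law term
    b = rng.uniform(0.1, 1.0)                # power-law exponent
    sigma = rng.uniform(0.001, 0.05)         # observation noise level

    # Scales (e.g., data, parameters, or compute) on a logarithmic grid.
    x = np.logspace(0, 6, n_points)
    y = eps_inf + a * x ** (-b) + rng.normal(0.0, sigma, n_points)
    return x, y

rng = np.random.default_rng(0)
# A PFN would be meta-trained on many such curves, conditioning on the
# small-scale prefix of each curve and predicting the large-scale suffix.
curves = [sample_synthetic_scaling_curve(rng) for _ in range(1024)]
```

Because curves can be drawn endlessly from such a prior, the PFN never sees the same training function twice, which is what lets it meta-learn extrapolation behavior rather than memorize specific curves.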
Lay Summary: As AI systems gain more resources, such as computing power or data, their performance typically improves. Predicting these improvements is vital for deciding whether to invest in additional resources. However, current tools often provide a single best guess without showing how certain it is, complicating decisions, especially with limited data.
Our research presents the first method of its kind: an uncertainty-aware Bayesian approach to predicting AI scaling trends. We built a system trained on a large number of simulated examples of AI performance growth. This enables our method to forecast future performance, detect complex patterns, even when trends shift unexpectedly, and indicate how confident each prediction is.
Compared to traditional statistical methods, our approach offers more accurate predictions and better insight into their reliability, as demonstrated with real-world AI development data. This provides a trustworthy way to estimate future AI capabilities, supporting investment decisions and enhancing research efficiency by suggesting optimal experiments.
Link To Code: https://github.com/DongWooLee-Eli/nslpfn
Primary Area: Deep Learning->Foundation Models
Keywords: Neural Scaling Laws, Bayesian Inference, Prior-data Fitted Networks
Submission Number: 14355