Abstract: Currently, it is hard to reap the benefits of deep learning for Bayesian methods. We present Prior-Data Fitted Networks (PFNs), a method that allows large-scale machine learning techniques to be used to approximate a large set of posteriors. The only requirement for PFNs is the ability to sample from a prior distribution over supervised learning tasks (or functions). The method repeatedly draws a task (or function) from this prior, draws a set of data points and their labels from it, masks one of the labels, and learns to make probabilistic predictions for the masked label based on the set-valued input of the remaining data points. Presented with samples from a new supervised learning task as input, it can then make probabilistic predictions for arbitrary other data points in a single forward pass, effectively having learned to perform Bayesian inference. We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems, with over 200-fold speedups in multiple setups compared to current methods. We obtain strong results in such diverse areas as Gaussian process regression and Bayesian neural networks, demonstrating the generality of PFNs.
Contribution Process Agreement: Yes
Poster Session Selection: Poster session #1 (12:00 UTC)
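
Illustrative sketch (not part of the submission): the data-generation loop described in the abstract can be outlined roughly as follows. This is a minimal sketch that assumes a toy prior over noisy linear functions; the names `sample_task` and `sample_training_example`, the prior itself, and all constants are hypothetical and are not taken from the paper. It covers only the sampling and label-masking step, not the network or its training.

```python
import random


def sample_task(rng):
    """Draw one function from a toy prior (assumed here: y = w*x + b + noise)."""
    w = rng.gauss(0.0, 1.0)
    b = rng.gauss(0.0, 1.0)

    def f(x):
        # Observation noise with a fixed, assumed standard deviation of 0.1.
        return w * x + b + rng.gauss(0.0, 0.1)

    return f


def sample_training_example(rng, n_points=10):
    """Return (context, query_x, target_y): one dataset with one label held out."""
    f = sample_task(rng)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n_points)]
    ys = [f(x) for x in xs]
    held_out = rng.randrange(n_points)  # index of the masked label
    context = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i != held_out]
    return context, xs[held_out], ys[held_out]


rng = random.Random(0)
batch = [sample_training_example(rng) for _ in range(4)]
for context, query_x, target_y in batch:
    # A model would be fitted to predict target_y from (context, query_x),
    # for example by maximising the log-likelihood of target_y.
    print(len(context), round(query_x, 3), round(target_y, 3))
```

Each training example pairs a set of labelled points with one query input whose label is hidden, so a model fitted on many such examples only ever needs samples from the prior, which is the single requirement the abstract states.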