Transformers Can Do Bayesian-Inference By Meta-Learning on Prior-Data

Samuel Müller; Noah Hollmann; Sebastian Pineda Arango; Josif Grabocka; Frank Hutter

Published: 10 Dec 2021, Last Modified: 05 May 2023 | NeurIPS 2021 Workshop MetaLearn Poster | Readers: Everyone

Abstract: Currently, it is hard to reap the benefits of deep learning for Bayesian methods. We present Prior-Data Fitted Networks (PFNs), a method that makes it possible to employ large-scale machine learning techniques to approximate a large set of posteriors. The only requirement for PFNs is the ability to sample from a prior distribution over supervised learning tasks (or functions). The method repeatedly draws a task (or function) from this prior, draws a set of data points and their labels from it, masks one of the labels, and learns to make probabilistic predictions for it based on the set-valued input of the remaining data points. Presented with samples from a new supervised learning task as input, it can then make probabilistic predictions for arbitrary other data points in a single forward propagation, effectively having learned to perform Bayesian inference. We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems, with over 200-fold speedups in multiple setups compared to current methods. We obtain strong results in areas as diverse as Gaussian process regression and Bayesian neural networks, demonstrating the generality of PFNs.
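
The data-generation loop described in the abstract can be illustrated with a minimal sketch. It covers only the sampling and masking step, not the network or its training. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian process prior with a squared-exponential kernel, the length scale, the noise level, the input range, and the hypothetical helper names `rbf_kernel`, `sample_prior_task`, and `make_training_example`.

```python
import numpy as np


def rbf_kernel(xa, xb, length_scale=0.5):
    # Squared-exponential kernel between two 1-D input arrays
    # (kernel choice and length scale are illustrative assumptions).
    diff = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def sample_prior_task(rng, n_points=10, noise_std=0.1):
    # Draw one task from the prior: random inputs, then noisy function
    # values sampled jointly from a zero-mean GP at those inputs.
    x = rng.uniform(-1.0, 1.0, size=n_points)
    cov = rbf_kernel(x, x) + noise_std ** 2 * np.eye(n_points)
    y = rng.multivariate_normal(np.zeros(n_points), cov)
    return x, y


def make_training_example(rng, n_points=10):
    # Mask one label: the remaining (x, y) pairs form the set-valued
    # context, and the held-out pair is the prediction target.
    x, y = sample_prior_task(rng, n_points)
    i = rng.integers(n_points)
    keep = np.arange(n_points) != i
    return (x[keep], y[keep]), x[i], y[i]


rng = np.random.default_rng(0)
(context_x, context_y), query_x, target_y = make_training_example(rng)
```

Repeating `make_training_example` many times would yield the stream of (context, query, target) triples on which a set-input model could be fit to predict a distribution over `target_y`. Any prior that can be sampled from could replace the GP used here.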

Poster Session Selection: Poster session #1 (12:00 UTC)
