- Keywords: Bayesian Neural Network, Hierarchical Prior, Gaussian Process
- TL;DR: We introduce a Gaussian Process Prior over weights in a neural network and explore its ability to model input-dependent weights with benefits to various tasks, including uncertainty estimation and generalization in the low-sample setting.
- Abstract: Bayesian inference offers a theoretically grounded and general way to train neural networks and can potentially give calibrated uncertainty. However, it is challenging to specify a meaningful and tractable prior over the network parameters, and deal with the weight correlations in the posterior. To this end, this paper introduces two innovations: (i) a Gaussian process-based hierarchical model for the network parameters based on recently introduced unit embeddings that can flexibly encode weight structures, and (ii) input-dependent contextual variables for the weight prior that can provide convenient ways to regularize the function space being modeled by the network through the use of kernels. We show these models provide desirable test-time uncertainty estimates, demonstrate cases of modeling inductive biases for neural networks with kernels and demonstrate competitive predictive performance on an active learning benchmark.