Keywords: Bayesian neural network, Gaussian processes, feature learning, statistical physics
Abstract: The power of neuronal networks comes from their adaptation to training data, known as feature learning. We consider feature learning within Bayesian learning and derive the two prominent high-dimensional theories, kernel scaling and kernel adaptation, from a unified large-deviation approach. We then show when feature learning escapes the kernel scaling approach but is still captured by kernel adaptation.
Student Paper: Yes
Submission Number: 33