Keywords: meta-learning, hypernetworks
TL;DR: We propose a hypernetwork-based approach to meta-learning that learns to map from task distributions directly to initial weights.
Abstract: Meta-learning recovers the Bayes-optimal learner for a particular distribution over tasks. However, meta-learning is slow and requires retraining from scratch whenever the environment changes. In this work, we directly learn a mapping from a task distribution to the Bayes-optimal parameters of the learner (for a neural network, the initial weights). We provide theoretical results characterizing the optimal mapping for linear-Gaussian models, and then demonstrate that hypernetworks can learn this mapping from empirical data for both linear and non-linear models. This approach reduces the computational cost of building adaptive Bayes-optimal learners: by exploiting the underlying structure of task distributions, we can meta-learn once and then quickly adapt to new settings with a single forward pass through the learned mapping.
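The core idea in the abstract — a single forward pass from a task-distribution descriptor to the learner's initial weights — can be sketched minimally as follows. This is an illustrative sketch only, not the authors' implementation: the descriptor dimension, architecture sizes, and all names (`hypernet`, `DESC_DIM`, etc.) are assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

DESC_DIM = 4         # task-distribution descriptor, e.g. prior mean/variance stats (assumed)
HIDDEN = 32          # hypernetwork hidden width (assumed)
LEARNER_PARAMS = 10  # number of learner parameters to generate (assumed)

# Hypernetwork parameters: a one-hidden-layer MLP (illustrative architecture).
W1 = rng.normal(scale=0.1, size=(HIDDEN, DESC_DIM))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(scale=0.1, size=(LEARNER_PARAMS, HIDDEN))
b2 = np.zeros(LEARNER_PARAMS)

def hypernet(task_descriptor):
    """Map a task-distribution descriptor to the learner's initial weights."""
    h = np.tanh(W1 @ task_descriptor + b1)
    return W2 @ h + b2

# After meta-training, adapting to a new task distribution costs only
# one forward pass -- no retraining from scratch.
descriptor = rng.normal(size=DESC_DIM)
init_weights = hypernet(descriptor)
print(init_weights.shape)  # (10,)
```

In the linear-Gaussian setting the abstract mentions, the target of this mapping has a closed form (the Bayes-optimal initialization is determined by the task prior), which is what makes the theoretical characterization tractable.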
Supplementary Material: zip
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 20252