Modulating transfer between tasks in gradient-based meta-learning

Erin Grant; Ghassen Jerfel; Katherine Heller; Thomas L. Griffiths

27 Sept 2018 (modified: 05 May 2023) | ICLR 2019 Conference Blind Submission | Readers: Everyone

Abstract: Learning-to-learn or meta-learning leverages data-driven inductive bias to increase the efficiency of learning on a novel task. This approach encounters difficulty when transfer is not mutually beneficial, for instance, when tasks are sufficiently dissimilar or change over time. Here, we use the connection between gradient-based meta-learning and hierarchical Bayes to propose a mixture of hierarchical Bayesian models over the parameters of an arbitrary function approximator such as a neural network. Generalizing the model-agnostic meta-learning (MAML) algorithm, we present a stochastic expectation maximization procedure to jointly estimate parameter initializations for gradient descent as well as a latent assignment of tasks to initializations. This approach better captures the diversity of training tasks as opposed to consolidating inductive biases into a single set of hyperparameters. Our experiments demonstrate better generalization on the standard miniImageNet benchmark for 1-shot classification. We further derive a novel and scalable non-parametric variant of our method that captures the evolution of a task distribution over time as demonstrated on a set of few-shot regression tasks.

Keywords: meta-learning, clustering, learning-to-learn, mixture, hierarchical Bayes, hierarchical model, gradient-based meta-learning

TL;DR: We use the connection between gradient-based meta-learning and hierarchical Bayes to learn a mixture of meta-learners that is appropriate for a heterogeneous and evolving task distribution.

Data: [mini-Imagenet](https://paperswithcode.com/dataset/mini-imagenet)
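
The stochastic EM procedure named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a scalar linear model `y = w * x` in place of a neural network, a softmax over the negative post-adaptation query loss as the E-step, and a first-order approximation of the MAML gradient as the M-step. All function names (`adapt`, `responsibilities`, `em_step`, `sample_task`, `train`) and all hyperparameter values are hypothetical.

```python
import math
import random


def loss_and_grad(w, data):
    """Mean squared error of the toy model y = w * x, and its gradient in w."""
    n = len(data)
    loss = sum((w * x - y) ** 2 for x, y in data) / n
    grad = sum(2.0 * (w * x - y) * x for x, y in data) / n
    return loss, grad


def adapt(w_init, support, inner_lr=0.1):
    """Inner loop: one gradient step from an initialization on the support set."""
    _, grad = loss_and_grad(w_init, support)
    return w_init - inner_lr * grad


def responsibilities(inits, support, query, inner_lr=0.1):
    """E-step (assumed form): softmax over negative post-adaptation query loss."""
    losses = [loss_and_grad(adapt(w, support, inner_lr), query)[0] for w in inits]
    best = min(losses)  # subtract the minimum for numerical stability
    weights = [math.exp(-(loss - best)) for loss in losses]
    total = sum(weights)
    return [weight / total for weight in weights]


def em_step(inits, tasks, inner_lr=0.1, outer_lr=0.1):
    """One stochastic EM update over a batch of (support, query) tasks."""
    grads = [0.0] * len(inits)
    for support, query in tasks:
        resp = responsibilities(inits, support, query, inner_lr)
        for k, w in enumerate(inits):
            # M-step, first-order approximation: the query-loss gradient at the
            # adapted parameter is applied directly to the initialization,
            # weighted by the (fixed) responsibility of component k.
            _, grad = loss_and_grad(adapt(w, support, inner_lr), query)
            grads[k] += resp[k] * grad / len(tasks)
    return [w - outer_lr * g for w, g in zip(inits, grads)]


def sample_task(rng, n_points=5):
    """Toy task distribution (an assumption): slopes cluster near -2 and +2."""
    slope = rng.choice([-2.0, 2.0]) + rng.gauss(0.0, 0.1)

    def draw():
        xs = [rng.uniform(-1.0, 1.0) for _ in range(n_points)]
        return [(x, slope * x) for x in xs]

    return draw(), draw()


def train(seed=0, steps=500, batch_size=8):
    """Run the sketch with two initializations and return their final values."""
    rng = random.Random(seed)
    # The initializations must start distinct; identical values would receive
    # identical updates and never specialize.
    inits = [-0.5, 0.5]
    for _ in range(steps):
        tasks = [sample_task(rng) for _ in range(batch_size)]
        inits = em_step(inits, tasks)
    return inits
```

Calling `train()` returns the two learned initializations. In this toy setting each one drifts toward a different slope cluster, which mirrors the abstract's point that several initializations can capture task diversity that a single set of hyperparameters would average away.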
