Keywords: meta-learning, ai, artificial intelligence, machine learning, artificial general intelligence, deep learning, neural networks
TL;DR: We show that, simply by changing the task, meta-learning algorithms (MAML) previously believed to work only through feature re-use in fact exhibit rapid learning
Abstract: It has been recently observed that a good embedding is all we need to solve many few-shot learning benchmarks.
In addition, other work has strongly suggested that Model Agnostic Meta-Learning (MAML) mostly works via this same method -- by learning a good embedding.
This highlights our lack of understanding of what meta-learning algorithms are doing and when they work.
In this work we provide empirical results that shed light on what meta-learning algorithms are doing and when they work.
In particular we identify three interesting properties:
1) In contrast to previous work, we show that it is possible to define a family of synthetic benchmarks that result in a low degree of feature re-use -- suggesting that
current few-shot learning benchmarks {\em might not have the properties} needed for the success of meta-learning algorithms;
2) meta-overfitting occurs when the number of classes (or concepts) is finite, and this issue disappears once the task has an {\em unbounded} number of concepts (e.g. online learning);
3) more adaptation at meta-test time with MAML does not necessarily result in a significant representation change or even an improvement in meta-test performance -- even when training on our proposed synthetic benchmarks.
Finally, we suggest that, to understand meta-learning algorithms better, it is imperative that we go beyond tracking absolute performance alone and, in addition, formally quantify the degree of meta-learning, reporting both metrics together.
Reporting results this way in future work will help us identify the sources of meta-overfitting more accurately and, hopefully, design more flexible meta-learning algorithms that learn beyond fixed feature re-use.
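As one illustration of what "quantifying representation change" could look like in practice (the abstract does not prescribe a specific metric), below is a minimal sketch that uses linear Centered Kernel Alignment (CKA) to compare a layer's activations before and after MAML's inner-loop adaptation; the encoder names and shapes are hypothetical.
```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two feature matrices.

    X, Y: arrays of shape (n_examples, n_features) holding the layer
    activations produced for the same batch of inputs.
    Returns a similarity in [0, 1]; values near 1 mean the
    representation barely changed (i.e., feature re-use dominates).
    """
    # Center each feature dimension.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # Linear-kernel HSIC terms.
    hsic_xy = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    hsic_xx = np.linalg.norm(X.T @ X, ord="fro")
    hsic_yy = np.linalg.norm(Y.T @ Y, ord="fro")
    return hsic_xy / (hsic_xx * hsic_yy)

# Hypothetical usage: compare activations on the same support set
# before and after the inner-loop adaptation step.
# feats_before = encoder(support_x)           # (n, d), pre-adaptation
# feats_after  = adapted_encoder(support_x)   # (n, d), post-adaptation
# representation_change = 1.0 - linear_cka(feats_before, feats_after)
```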