Does MAML Only Work via Feature Re-use?

Anonymous

30 Sept 2021 (modified: 05 May 2023), NeurIPS 2021 Workshop MetaLearn Blind Submission
Keywords: meta-learning, ai, artificial intelligence, machine learning, artificial general intelligence, deep learning, neural networks
TL;DR: We show that, by changing only the task, meta-learning algorithms (MAML) previously believed to work only via feature re-use in fact exhibit rapid learning.
Abstract: It has recently been observed that a good embedding is all we need to solve many few-shot learning benchmarks. In addition, other work has strongly suggested that Model-Agnostic Meta-Learning (MAML) mostly works in this same way: by learning a good embedding. This highlights our lack of understanding of what meta-learning algorithms are doing and when they work. In this work we provide empirical results that shed light on these questions. In particular, we identify three interesting properties: 1) in contrast to previous work, we show that it is possible to define a family of synthetic benchmarks that result in a low degree of feature re-use, suggesting that current few-shot learning benchmarks might not have the properties needed for the success of meta-learning algorithms; 2) meta-overfitting occurs when the number of classes (or concepts) is finite, and this issue disappears once the task has an unbounded number of concepts (e.g. online learning); 3) more adaptation at meta-test time with MAML does not necessarily result in a significant representation change, or even an improvement in meta-test performance, even when training on our proposed synthetic benchmarks. Finally, we suggest that, to understand meta-learning algorithms better, it is imperative that we go beyond tracking absolute performance alone and, in addition, formally quantify the degree of meta-learning and track both metrics together. Reporting results this way in future work will help us identify the sources of meta-overfitting more accurately and, hopefully, design more flexible meta-learning algorithms that learn beyond fixed feature re-use.
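As a concrete illustration of the abstract's closing suggestion, the sketch below shows one way such a "degree of meta-learning" could be quantified at meta-test time: measure how much the learned representation changes under inner-loop adaptation, and report that number alongside meta-test accuracy. This is a minimal sketch under assumptions, not the paper's actual metric; it uses PyTorch and linear CKA (one representation-similarity measure used in prior analyses of MAML), and the helper names `linear_cka` and `degree_of_adaptation` are hypothetical.

```python
import torch

def linear_cka(X: torch.Tensor, Y: torch.Tensor) -> float:
    """Linear Centered Kernel Alignment between two feature matrices of
    shape (n_samples, n_features). Returns 1.0 for identical
    representations (up to rotation / isotropic scaling); lower values
    indicate a larger representation change."""
    X = X - X.mean(dim=0, keepdim=True)  # center each feature dimension
    Y = Y - Y.mean(dim=0, keepdim=True)
    # HSIC-based form: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    num = (Y.T @ X).norm(p="fro") ** 2
    den = (X.T @ X).norm(p="fro") * (Y.T @ Y).norm(p="fro")
    return (num / den).item()

def degree_of_adaptation(feats_before: torch.Tensor,
                         feats_after: torch.Tensor) -> float:
    """Hypothetical degree-of-meta-learning metric: 0.0 means pure
    feature re-use (representation unchanged by inner-loop adaptation);
    values near 1.0 mean a substantial representation change."""
    return 1.0 - linear_cka(feats_before, feats_after)

if __name__ == "__main__":
    # Toy check: identical features -> ~0.0; independent random
    # features -> clearly above 0.0.
    f = torch.randn(64, 128)  # e.g. query-set embeddings of a task
    print(degree_of_adaptation(f, f))
    print(degree_of_adaptation(f, torch.randn(64, 128)))
```

In a MAML-style evaluation loop, this would amount to caching the query-set embeddings of the meta-learned body before the inner-loop steps, recomputing them after adaptation, and logging the resulting value together with meta-test accuracy, so that feature re-use and rapid learning can be distinguished rather than inferred from performance alone.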