Strategies for Classification Layer Initialization in Model-Agnostic Meta-Learning

Published: 02 May 2023, Last Modified: 02 May 2023
Venue: Blogposts @ ICLR 2023 Conditional
Readers: Everyone
Keywords: meta-learning, model-agnostic, representation change, last layer
Abstract: Raghu et al. [2020] found that in model-agnostic meta-learning (MAML) for few-shot classification, most of the change in the network during inner-loop fine-tuning occurs in the linear classification head. The common interpretation is that during this phase the linear head remaps the encoded features to the classes of the new task. In vanilla MAML, the weights of this final linear layer are meta-learned like every other parameter. This approach has two problems. First, it is hard to see how a single set of optimal head weights could exist, as class label permutations make clear: two tasks may contain the same classes in a different order, so weights that work well for the first task will likely fail on the second. Indeed, MAML's test performance can vary by up to 15% depending on how class labels are assigned at test time. Second, more challenging benchmarks such as Meta-Dataset are now being proposed for few-shot learning; they contain a varying number of classes per task, which makes learning a single fixed set of classification-layer weights impossible. It is therefore natural to ask how the final classification layer should be initialized before fine-tuning on a new task. Random initialization may not be optimal, as it can introduce unnecessary noise. This blog post discusses several approaches to last-layer initialization that claim to outperform the original MAML method.
Blogpost Url: https://iclr-blogposts.github.io/2023/blog/2023/classification-layer-initialization-in-maml/
ICLR Papers: https://openreview.net/forum?id=49h_IkpJtaE, https://openreview.net/forum?id=LDAwu17QaJz, https://openreview.net/forum?id=rkgAGAVKPr
ID Of The Authors Of The ICLR Paper: ~Han-Jia_Ye1, ~Chia_Hsiang_Kao1, ~Eleni_Triantafillou1
Conflict Of Interest: No
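The label-permutation sensitivity described in the abstract can be illustrated with a small sketch. All names and numbers below are hypothetical: it assumes a frozen feature encoder (random features stand in for it) and a head adapted by a few gradient steps on the support set, in the spirit of the body-frozen MAML variant. A zero-initialized head is symmetric across classes, so its adapted loss is identical under any relabeling of the classes, whereas a fixed "meta-learned" head is not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: frozen 8-d features for a 3-way support set.
feats = rng.normal(size=(6, 8))        # support examples (encoder output)
labels = np.array([0, 1, 2, 0, 1, 2])  # one class assignment
perm = np.array([2, 0, 1])             # same task, classes relabeled
y_perm = perm[labels]

def inner_loop_head(W, X, y, lr=0.1, steps=5):
    """Adapt only the linear head W (classes x dim) by gradient
    descent on the softmax cross-entropy over the support set."""
    for _ in range(steps):
        logits = X @ W.T
        p = np.exp(logits - logits.max(1, keepdims=True))
        p /= p.sum(1, keepdims=True)
        onehot = np.eye(W.shape[0])[y]
        grad = (p - onehot).T @ X / len(y)
        W = W - lr * grad
    return W

def loss(W, X, y):
    """Mean cross-entropy of head W on (X, y)."""
    logits = X @ W.T
    logits = logits - logits.max(1, keepdims=True)
    logp = logits - np.log(np.exp(logits).sum(1, keepdims=True))
    return -logp[np.arange(len(y)), y].mean()

# A stand-in for a meta-learned head vs. a zero-initialized head.
W_meta = rng.normal(size=(3, 8))
W_zero = np.zeros((3, 8))

for name, W0 in [("meta-learned", W_meta), ("zero-init", W_zero)]:
    l_orig = loss(inner_loop_head(W0, feats, labels), feats, labels)
    l_perm = loss(inner_loop_head(W0, feats, y_perm), feats, y_perm)
    print(f"{name}: loss {l_orig:.4f} vs permuted {l_perm:.4f}")
```

With zero initialization the two printed losses coincide (adapting under a relabeling just permutes the rows of the head), while the fixed head's adapted loss depends on which class got which label, which is the permutation sensitivity the abstract refers to.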
5 Replies
