Boosting Meta-Training with Base Class Information for Few-Shot Learning

21 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: Few-shot learning, meta-learning
TL;DR: We propose a new end-to-end training paradigm that boosts meta-training with base class gradient information for few-shot learning.
Abstract: Few-shot learning aims to learn a classifier that can be adapted, with only a few labeled examples, to recognize new classes unseen during training. Meta-learning has recently become the dominant framework for few-shot learning. Its original training paradigm is task-level learning, as in Model-Agnostic Meta-Learning (MAML) and Prototypical Networks. A more recently proposed paradigm, Meta-Baseline, which consists of sequential pre-training and meta-training stages, achieves state-of-the-art performance. However, Meta-Baseline is not an end-to-end method: the meta-training stage can only begin after pre-training is complete, which lengthens the overall training time. Moreover, the two training stages can adversely affect each other, causing accuracy in the later meta-training periods to drop even below that of Prototypical Networks. In this work, motivated by the stochastic variance reduced gradient (SVRG) optimization method, we propose a new end-to-end training paradigm consisting of two alternating loops. In the outer loop, we compute the cross-entropy loss on the whole training set but update only the final linear layer; in the inner loop, we use the original meta-learning training mode to compute the loss and incorporate the outer-loop gradient to guide the parameter update. This training paradigm not only converges quickly but also outperforms the baselines, indicating that information from the overall training set and the meta-learning training paradigm can mutually reinforce one another.
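To make the alternating-loop idea concrete, below is a minimal, hypothetical PyTorch sketch (not the authors' code) of how an outer pass over the base classes could update only a linear head while caching a whole-set gradient, and how an inner, prototypical-style episode could then add that cached gradient as a guiding term, in the SVRG-inspired spirit the abstract describes. The function names, the `guide_weight` coefficient, and the assumption that episode labels are already re-indexed to 0..C-1 are all illustrative choices, not details from the paper.

```python
# Hypothetical sketch of the alternating outer/inner training loops.
import torch
import torch.nn.functional as F

def outer_step(encoder, linear_head, base_loader, head_opt, device="cpu"):
    """Outer loop: cross-entropy over the whole base-class set; update only the
    final linear layer, and cache the averaged gradient w.r.t. the encoder."""
    encoder_grad = [torch.zeros_like(p) for p in encoder.parameters()]
    n_batches = 0
    for x, y in base_loader:
        x, y = x.to(device), y.to(device)
        loss = F.cross_entropy(linear_head(encoder(x)), y)
        encoder.zero_grad()
        linear_head.zero_grad()
        loss.backward()
        head_opt.step()  # only the linear head's parameters are updated here
        for g, p in zip(encoder_grad, encoder.parameters()):
            if p.grad is not None:
                g += p.grad.detach()
        n_batches += 1
    # Averaged base-class gradient, to be injected during the inner loop.
    return [g / max(n_batches, 1) for g in encoder_grad]

def inner_step(encoder, episode, enc_opt, outer_grad, guide_weight=0.1, device="cpu"):
    """Inner loop: prototypical-network episode loss, with the cached outer
    (base-class) gradient added as a guiding term before the encoder update."""
    (xs, ys), (xq, yq) = episode  # support / query split; labels assumed in 0..C-1
    xs, ys, xq, yq = xs.to(device), ys.to(device), xq.to(device), yq.to(device)
    zs, zq = encoder(xs), encoder(xq)
    protos = torch.stack([zs[ys == c].mean(0) for c in ys.unique()])
    logits = -torch.cdist(zq, protos)  # negative Euclidean distance as logits
    loss = F.cross_entropy(logits, yq)
    enc_opt.zero_grad()
    loss.backward()
    for p, g in zip(encoder.parameters(), outer_grad):
        if p.grad is not None:
            p.grad += guide_weight * g  # inject base-class gradient information
    enc_opt.step()
    return loss.item()
```

In use, the two steps would simply alternate: each `outer_step` refreshes the cached base-class gradient and the linear head, and a batch of `inner_step` episodes then meta-trains the encoder with that gradient mixed in, so no separate pre-training stage is required.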
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3447