Keywords: few-shot, meta-learning, prototypical network, fine-tuning, prototypical classifier
TL;DR: We derive a novel generalization bound for a prototypical classifier and theoretically and empirically show that focusing on the variance of the norm of a feature vector can improve performance.
Abstract: The prototypical network is a prototype classifier based on meta-learning and is widely used for few-shot learning because it classifies unseen examples by constructing class-specific prototypes without adjusting hyper-parameters during meta-testing.
Interestingly, recent research has drawn considerable attention by showing that training a new linear classifier, without any meta-learning algorithm, performs comparably to the prototypical network.
However, training a new linear classifier requires retraining the classifier every time a new class appears.
In this paper, we analyze how a prototype classifier can work equally well without training a new linear classifier and without meta-learning.
We experimentally find that a prototype classifier constructed at meta-test time directly from feature vectors extracted by standard pre-trained models does not perform as well as either the prototypical network or a new linear classifier trained on the feature vectors of the pre-trained models.
Thus, we derive a novel generalization bound for the prototypical classifier and show that transforming the feature vectors can improve the performance of prototype classifiers.
We experimentally investigate several normalization methods for minimizing the derived bound and find that comparable performance can be obtained by applying L2 normalization and minimizing the ratio of the within-class variance to the between-class variance, without training a new classifier or meta-learning.
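As an illustration, below is a minimal sketch (not the authors' code) of a prototype classifier built at meta-test time from pre-trained feature vectors, with L2 normalization applied as one example of the feature transformation discussed above. The array shapes and the use of random features in place of a real pre-trained backbone are assumptions made only for this example.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each feature vector to unit L2 norm."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def build_prototypes(support_features, support_labels):
    """Average the support features of each class to form its prototype."""
    classes = np.unique(support_labels)
    prototypes = np.stack(
        [support_features[support_labels == c].mean(axis=0) for c in classes]
    )
    return classes, prototypes

def predict(query_features, classes, prototypes):
    """Assign each query to the class of its nearest (Euclidean) prototype."""
    dists = np.linalg.norm(
        query_features[:, None, :] - prototypes[None, :, :], axis=-1
    )
    return classes[dists.argmin(axis=1)]

# Usage: random features stand in for the output of a pre-trained backbone.
rng = np.random.default_rng(0)
support_x = rng.normal(size=(25, 64))               # 5-way 5-shot support set
support_y = np.repeat(np.arange(5), 5)
query_x = rng.normal(size=(15, 64))

# Feature transformation step: L2 normalization before building prototypes.
support_x, query_x = l2_normalize(support_x), l2_normalize(query_x)
classes, prototypes = build_prototypes(support_x, support_y)
print(predict(query_x, classes, prototypes))
```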
Supplementary Material: pdf