Studying BatchNorm Learning Rate Decay on Meta-Learning Inner-Loop Adaptation

Sep 30, 2021 (edited Dec 06, 2021) · NeurIPS 2021 Workshop MetaLearn Poster
  • Keywords: meta-learning, few-shot classification, batch normalization
  • TL;DR: Counteracting batch normalization's implicit learning rate decay increases inner-loop adaptation in meta-learning models.
  • Abstract: Meta-learning for few-shot classification has been challenged both on its effectiveness relative to simpler pretraining methods and on the validity of its claim of "learning to learn". Recent work has suggested that MAML-based models do not perform "rapid learning" in the inner loop but instead reuse features, adapting only the final linear layer. Separately, BatchNorm, a near-ubiquitous component of model architectures, has been shown to have an implicit learning rate decay effect on the preceding layers of a network. We study the impact of BatchNorm's implicit learning rate decay on feature reuse in meta-learning methods and find that counteracting it increases the change in intermediate layers during adaptation. We also find that counteracting this learning rate decay sometimes improves performance on few-shot classification tasks.
  • Contribution Process Agreement: Yes
  • Author Revision Details:
      • We considered and made references to related papers suggested by Reviewer WLRm.
      • Clarified why we believe rapid learning is preferable over feature re-use in meta-learning models.
      • Re-ran hyperparameter tuning, allowing us to draw stronger conclusions in our discussion of results.
  • Poster Session Selection: Poster session #3 (16:50 UTC)
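The abstract's premise, that BatchNorm imposes an implicit learning rate decay on the layers preceding it, follows from BN making those layers scale-invariant: rescaling the weights leaves the normalized output unchanged, so gradients shrink in inverse proportion to the weight norm. A minimal NumPy sketch of this effect (illustrative only, not code from the paper; `bn_forward`, the toy loss, and all shapes are assumptions):

```python
import numpy as np

def bn_forward(W, X, eps=1e-5):
    """Linear layer followed by batch normalization (no learned affine)."""
    Z = X @ W
    return (Z - Z.mean(axis=0)) / np.sqrt(Z.var(axis=0) + eps)

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 8))   # a batch of 64 inputs
W = rng.normal(size=(8, 4))    # weights of the layer preceding BN
T = rng.normal(size=(64, 4))   # fixed target defining a scalar loss

def loss(W):
    # Any scalar function of the BN output is (nearly) scale-invariant in W.
    return np.sum(bn_forward(W, X) * T)

def num_grad(f, W, h=1e-5):
    """Central finite-difference gradient of f at W."""
    g = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        Wp, Wm = W.copy(), W.copy()
        Wp[idx] += h
        Wm[idx] -= h
        g[idx] = (f(Wp) - f(Wm)) / (2 * h)
    return g

# 1) BN output is invariant to rescaling the preceding weights.
assert np.allclose(bn_forward(W, X), bn_forward(3.0 * W, X), atol=1e-4)

# 2) Consequently, for a scale-invariant loss, grad f(cW) = grad f(W) / c:
#    as the weights grow during training, the gradient (and hence the
#    effective learning rate of the preceding layer) shrinks.
g1 = num_grad(loss, W)
g3 = num_grad(loss, 3.0 * W)
assert np.allclose(g3, g1 / 3.0, rtol=1e-3, atol=1e-4)
```

In the inner-loop adaptation setting the paper studies, this shrinking effective learning rate would dampen updates to intermediate layers, which is the mechanism the authors propose to counteract.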