Few-shot Learning with Online Self-Distillation

Yue Wang; Sihan Liu

Published: 06 Aug 2021, Last Modified: 05 May 2023, Venue: VIPriors 2021 OralPosterTBD, Readers: Everyone

Keywords: few-shot learning, data augmentation, CutMix

TL;DR: We achieve a new state of the art on few-shot learning benchmarks by combining online self-distillation with CutMix augmentation.

Abstract: Few-shot learning has been a long-standing problem in learning to learn. This problem typically involves training a model on an extremely small amount of data and testing the model on out-of-distribution data. Recent few-shot learning research has focused on developing good representation models that can quickly adapt to test tasks. To that end, we propose a model that learns representations through online self-distillation. Our model combines supervised training with knowledge distillation via a continuously updated teacher. We also find that data augmentation plays an important role in producing robust features. Our final model is trained with CutMix augmentation and online self-distillation. On the commonly used miniImageNet benchmark, our model achieves 67.07% and 83.03% accuracy under the 5-way 1-shot and 5-way 5-shot settings, respectively. It outperforms counterparts of its kind by 2.25% and 0.89%.
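
The two ingredients named in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation. It assumes the "continuously updated teacher" is an exponential moving average (EMA) of the student weights, which the abstract does not confirm, and the function names, `momentum`, `weight`, `temperature`, and `alpha` values are all hypothetical choices made for illustration. The CutMix part follows the commonly described recipe of pasting a random box from one image into another and mixing labels by the surviving area.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_probs):
    """Cross-entropy against a (possibly soft) target distribution."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target_probs, probs) if t > 0)


def kl_divergence(teacher_probs, student_probs):
    """KL(teacher || student) for two discrete distributions."""
    return sum(t * math.log(t / s)
               for t, s in zip(teacher_probs, student_probs) if t > 0)


def ema_update(teacher_params, student_params, momentum=0.99):
    """ASSUMPTION: the teacher tracks the student as an EMA of its weights."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_params, student_params)]


def cutmix(image_a, image_b, label_a, label_b, alpha=1.0, rng=random):
    """Paste a random box from image_b into image_a and mix the labels.

    Images are lists of rows (height x width); labels are one-hot lists.
    """
    height, width = len(image_a), len(image_a[0])
    lam = rng.betavariate(alpha, alpha)
    cut_ratio = math.sqrt(1.0 - lam)
    cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)
    cy, cx = rng.randrange(height), rng.randrange(width)
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)

    mixed = [row[:] for row in image_a]
    for y in range(top, bottom):
        mixed[y][left:right] = image_b[y][left:right]

    # Recompute lam from the clipped box so labels match the pixels exactly.
    lam = 1.0 - (bottom - top) * (right - left) / (height * width)
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed, mixed_label, lam


def distillation_loss(student_logits, teacher_logits, target,
                      weight=0.5, temperature=4.0):
    """Supervised loss plus a distillation term (hypothetical weighting).

    The usual temperature-squared scaling of the KD term is omitted here.
    """
    ce = cross_entropy(student_logits, target)
    kd = kl_divergence(softmax(teacher_logits, temperature),
                       softmax(student_logits, temperature))
    return (1.0 - weight) * ce + weight * kd
```

In a training loop under these assumptions, each step would apply `cutmix` to a pair of samples, compute `distillation_loss` from the student and teacher outputs on the mixed input, update the student by gradient descent, and then call `ema_update` to refresh the teacher.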
