Keywords: Few Shot Learning, Learning Instability
Abstract: Modern machine learning is highly data-intensive. Few-shot learning (FSL) aims to resolve this sample efficiency problem by learning from multiple tasks and quickly adapt to new tasks containing only a few samples. However, FSL problems proves to be significantly more challenging and require more compute expensive process to optimize. In this work, we consider multi-task linear regression (MTLR) as a canonical problem for few-shot learning, and investigate the source of challenge of FSL. We find that the MTLR exhibits local minimum problems that are not present in single-task problem, and thus making the learning much more challenging. We also show that the problem can be resolved by overparameterizing the model by increasing both the width and depth of the linear network and initializing the weights with small values, exploiting the implicit regularization bias of gradient descent-based learning.
1 Reply
Loading