Abstract: Neural Architecture Search (NAS) has recently outperformed hand-designed networks in various artificial intelligence areas. However, previous works only target a pre-defined task. For a new task in few-shot learning (FSL) scenarios, the architecture is either searched from scratch, which is neither efficient nor flexible, or borrowed from architectures obtained on other tasks, which may be sub-optimal. Can we select the best neural architectures without any training and thereby eliminate a significant portion of the search cost for new tasks in FSL? In this work, we provide an affirmative answer by proposing a novel information bottleneck (IB) theory driven \textit{Few-shot Neural Architecture Search} (dubbed IBFS) framework to address this issue. We first derive that the global convergence of Model-Agnostic Meta-Learning (MAML) can be guaranteed by considering only the first-order loss landscape. Moreover, motivated by the observation that IB provides a unified view toward understanding machine learning models, we propose a novel Zero-Cost method tailored for FSL that ranks and selects architectures based on their \textit{expressivity} obtained through IB mechanisms. Extensive experiments show that IBFS achieves state-of-the-art performance in FSL without training, which demonstrates its effectiveness.
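The abstract does not spell out the IB-based expressivity score itself, so the following is only a rough, hypothetical sketch of the training-free ranking pipeline it describes: a generic zero-cost proxy (log-determinant of an activation Gram matrix, in the spirit of existing zero-cost NAS scores) stands in for the paper's IB-derived expressivity, and the candidate architectures are simple illustrative MLPs rather than the actual search space.

```python
# Illustrative sketch only: the proxy and candidate architectures below are
# assumptions, not the paper's actual IBFS score or search space.
import torch
import torch.nn as nn

def make_candidate(depth: int, width: int, in_dim: int = 32) -> nn.Module:
    """Build a hypothetical candidate architecture (untrained MLP)."""
    layers, d = [], in_dim
    for _ in range(depth):
        layers += [nn.Linear(d, width), nn.ReLU()]
        d = width
    return nn.Sequential(*layers)

@torch.no_grad()
def expressivity_proxy(net: nn.Module, support: torch.Tensor) -> float:
    """Training-free proxy: log-det of the Gram matrix of untrained activations
    on a few-shot support batch (higher = more distinguishable representations)."""
    feats = net(support)                        # (N, width) activations
    gram = feats @ feats.T                      # (N, N) similarity structure
    gram += 1e-3 * torch.eye(gram.shape[0])     # regularize for numerical stability
    sign, logdet = torch.linalg.slogdet(gram)
    return logdet.item()

# Rank a handful of candidates on a tiny 5-way 5-shot style support batch.
support_batch = torch.randn(25, 32)             # 25 support examples, 32-dim features
candidates = {f"d{d}_w{w}": make_candidate(d, w) for d in (2, 4) for w in (64, 128)}
scores = {name: expressivity_proxy(net, support_batch) for name, net in candidates.items()}
for name, s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: proxy score = {s:.2f}")     # top-ranked architecture is selected with no training
```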
Lay Summary: Can we design the best neural network for a new task without any training? This is an extremely challenging question for the deep learning community, especially when only a few data samples are available. We answer this question with a method called IB-driven Few-shot Neural Architecture Search (IBFS), which evaluates, in a training-free manner, the expressivity of architectures sampled from the search space for few-shot learning.
Our paper presents the surprising result that, for few-shot learning problems, well-performing architectures can be selected without any training while still obtaining state-of-the-art performance. Our findings have implications for how we measure the expressivity of architectures tailored for few-shot learning and how we generalize to unseen tasks without expensive trial-and-error training, and they demonstrate that neural architectures play an important role in few-shot learning.
Primary Area: General Machine Learning->Representation Learning
Keywords: Few-shot learning, Training-free, MAML, Representation Learning
Submission Number: 10149