HyperFlow: Gradient-Free Emulation of Few-Shot Fine-Tuning

14 Sept 2025 (modified: 11 Feb 2026) | Submitted to ICLR 2026 | Readers: Everyone | License: CC BY 4.0
Keywords: few-shot learning, test-time adaptation, gradient flows
TL;DR: A computationally efficient, gradient-free adaptation mechanism that serves as an alternative to gradient-based few-shot fine-tuning.
Abstract: While test-time fine-tuning is beneficial in few-shot learning, the need for multiple backpropagation steps can be prohibitively expensive in resource-constrained environments or end devices. To address this limitation, we propose a computationally efficient test-time adaptation approach that emulates gradient descent without computing gradients. Specifically, we formulate gradient descent as an Euler discretization of an ordinary differential equation (ODE) and train a lightweight auxiliary network to predict the task-conditional drift using only the few-shot support set. The adaptation then reduces to a simple numerical integration (e.g., via the Euler method), which requires only a few forward passes of the auxiliary network—no gradients or forward passes of the target model are needed. In experiments on cross-domain few-shot classification using the Meta-Dataset and CD-FSL benchmarks, our method significantly improves out-of-domain performance over the non-fine-tuned baseline while incurring only 6% of peak memory and 0.1% of the FLOPs of standard fine-tuning, thus establishing a practical middle ground between direct transfer and fine-tuning approaches.
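Illustrative sketch (not from the submission): the abstract's central observation is that a gradient-descent update, theta_{k+1} = theta_k - eta * grad L(theta_k), is one explicit Euler step of the ODE d(theta)/dt = -grad L(theta) with step size eta. The minimal Python example below checks this equivalence on a toy quadratic loss. All names here (`euler_integrate`, `make_quadratic_drift`, `gradient_descent`) are hypothetical, and the closed-form drift is only a stand-in: in the proposed method the drift would instead be predicted by a trained auxiliary network conditioned on the few-shot support set, whose architecture and training are not specified in the abstract.

```python
from typing import Callable, List

Vector = List[float]


def euler_integrate(
    drift: Callable[[Vector, float], Vector],
    theta0: Vector,
    step_size: float,
    num_steps: int,
) -> Vector:
    """Explicit Euler integration of d(theta)/dt = drift(theta, t)."""
    theta = list(theta0)
    for k in range(num_steps):
        t = k * step_size
        d = drift(theta, t)
        theta = [p + step_size * di for p, di in zip(theta, d)]
    return theta


def make_quadratic_drift(target: Vector) -> Callable[[Vector, float], Vector]:
    """Stand-in drift (assumption): the exact negative gradient of
    L(theta) = 0.5 * ||theta - target||^2.
    A learned drift predictor would replace this function."""
    def drift(theta: Vector, t: float) -> Vector:
        return [-(p - c) for p, c in zip(theta, target)]
    return drift


def gradient_descent(
    target: Vector, theta0: Vector, lr: float, num_steps: int
) -> Vector:
    """Plain gradient descent on the same quadratic loss, for comparison."""
    theta = list(theta0)
    for _ in range(num_steps):
        grad = [p - c for p, c in zip(theta, target)]
        theta = [p - lr * g for p, g in zip(theta, grad)]
    return theta


target = [1.0, 2.0]
theta0 = [0.0, 0.0]

adapted = euler_integrate(make_quadratic_drift(target), theta0, step_size=0.5, num_steps=3)
reference = gradient_descent(target, theta0, lr=0.5, num_steps=3)

print(adapted)    # [0.875, 1.75]
print(reference)  # [0.875, 1.75]
```

With the drift set to the exact negative gradient, Euler integration and gradient descent produce identical iterates. Replacing the drift with a predicted one removes the need to backpropagate through the target model during adaptation, at the cost of whatever approximation error the predictor introduces.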
Supplementary Material: pdf
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 5204