Keywords: deep learning, meta learning, hypernetworks, generative models, classifier guidance, contrastive learning, CLIP, classifier-free guidance, latent diffusion, diffusion models
TL;DR: We develop a meta-learning method that uses classifier(-free) guidance from the generative modeling literature to generate zero-shot adapted network weights.
Abstract: We aim to develop meta-learning techniques that achieve higher zero-shot performance than the state of the art on unseen tasks. To do so, we take inspiration from recent advances in generative modeling and language-conditioned image synthesis to propose meta-learning techniques that use natural language guidance for zero-shot task adaptation. We first train an unconditional generative hypernetwork model to produce neural network weights; we then train a second "guidance" model that, given a natural language task description, traverses the hypernetwork latent space to find high-performance task-adapted weights in a zero-shot manner. We explore two alternative approaches for latent space guidance: "HyperCLIP"-based classifier guidance and a conditional Hypernetwork Latent Diffusion Model ("HyperLDM"), which we show to benefit from the classifier-free guidance technique common in image generation. Finally, in zero-shot learning experiments on our Meta-VQA dataset, we demonstrate that our approaches outperform existing meta-learning methods.
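For readers unfamiliar with classifier-free guidance, the following is a minimal sketch of the update as it is commonly written in the image generation literature, applied here to a toy latent vector. It is not the paper's implementation: the function name `cfg_combine`, the toy values, and the parametrization (a scale of 0 giving the unconditional prediction and a scale of 1 the conditional one) are assumptions made for illustration, and the formulation used for HyperLDM may differ.

```python
from typing import List


def cfg_combine(
    eps_uncond: List[float],
    eps_cond: List[float],
    guidance_scale: float,
) -> List[float]:
    """Blend unconditional and conditional noise predictions.

    Hypothetical helper: extrapolates from the unconditional prediction
    toward (and, for guidance_scale > 1, past) the conditional one.
    """
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy noise predictions over a 3-dimensional latent (made-up values).
eps_uncond = [0.5, -1.0, 0.0]   # prediction without the task description
eps_cond = [1.0, -0.5, 2.0]     # prediction conditioned on the task description

# A scale above 1 pushes the result beyond the conditional prediction.
guided = cfg_combine(eps_uncond, eps_cond, guidance_scale=2.0)
print(guided)
```

In a diffusion sampler, the guided prediction would replace the plain conditional prediction at each denoising step; under this parametrization, larger scales weight the task description more strongly.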