Learning to Learn with Generative Models of Neural Network Checkpoints

William Peebles; Ilija Radosavovic; Tim Brooks; Alexei A Efros; Jitendra Malik

Learning to Learn with Generative Models of Neural Network Checkpoints

William Peebles, Ilija Radosavovic, Tim Brooks, Alexei A Efros, Jitendra Malik

Published: 01 Feb 2023, Last Modified: 22 Jun 2025Submitted to ICLR 2023Readers: Everyone

Keywords: diffusion, DDPMs, learning to learn, generative models, transformers

Abstract: We explore a data-driven approach for learning to optimize neural networks. We construct a dataset of neural network checkpoints and train a generative model on the parameters. In particular, our model is a conditional diffusion transformer that, given an initial input parameter vector and a prompted loss, error, or return, predicts the distribution over parameter updates that achieve the desired metric. At test time, it can optimize neural networks with unseen parameters for downstream tasks in just one update. We find that our approach successfully generates parameters for a wide range of loss prompts. Moreover, it can sample multimodal parameter solutions and has favorable scaling properties. We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Submission Guidelines: Yes

Please Choose The Closest Area That Your Submission Falls Into: Generative models

TL;DR: We construct a dataset of neural network checkpoints and train a loss-conditional generative model on the parameters. The generative model can train neural networks with unseen initializations in one step.

Supplementary Material: zip

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 2 code implementations](https://www.catalyzex.com/paper/learning-to-learn-with-generative-models-of/code)

10 Replies

Loading