Recurrent Diffusion for Large-Scale Parameter Generation

Kai Wang; Dongwen Tang; Wangbo Zhao; Yang You

24 Sept 2024 (modified: 14 Nov 2024), ICLR 2025 Conference Withdrawn Submission, CC BY 4.0

Keywords: parameter generation

Abstract: Parameter generation has long struggled to scale, significantly limiting its range of applications. In this study, we introduce Recurrent diffusion for large-scale Parameter Generation, called RPG. We first divide the trained parameters into non-overlapping parts, and then propose a recurrent model to learn the relationships among them. The recurrent model’s outputs are then fed, as conditions, into a diffusion model to generate the neural network parameters. Using only a single GPU, recurrent diffusion enables us to generate parameters for popular vision and language models, such as ConvNeXt-L and the LoRA parameters of LLaMA-7B. Meanwhile, across various architectures and tasks, the generated parameters consistently achieve performance comparable to that of trained networks. Notably, our approach also shows the potential to generate models for handling unseen tasks. This suggests that recurrent diffusion largely increases the practicality of parameter generation.
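
The first step named in the abstract, dividing trained parameters into non-overlapping parts, can be illustrated with a minimal sketch. The abstract does not say how the split is performed, so everything here is an assumption made for illustration: the function names `split_into_parts` and `merge_parts`, the fixed `part_size`, and the zero-padding of the last part are all hypothetical. The recurrent model and the diffusion model are not covered.

```python
import numpy as np

def split_into_parts(params, part_size):
    """Flatten a list of weight arrays and cut the result into
    non-overlapping, fixed-size parts (hypothetical helper)."""
    flat = np.concatenate(
        [np.asarray(p, dtype=np.float32).ravel() for p in params]
    )
    # Assumption: zero-pad the tail so the length is a multiple of part_size.
    pad = (-flat.size) % part_size
    padded = np.pad(flat, (0, pad))
    return padded.reshape(-1, part_size), flat.size

def merge_parts(parts, n_params):
    """Inverse of split_into_parts: drop the padding and return a flat vector."""
    return parts.reshape(-1)[:n_params]

# Toy "trained parameters": a 2x3 weight matrix and a length-4 bias vector.
weights = [
    np.arange(6, dtype=np.float32).reshape(2, 3),
    np.arange(4, dtype=np.float32),
]
parts, n_params = split_into_parts(weights, part_size=4)
restored = merge_parts(parts, n_params)
```

In this toy case the 10 parameters become 3 parts of 4 values each, with 2 padded zeros at the end; `merge_parts` recovers the original flat vector, which would then have to be reshaped back into the per-layer arrays.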

Primary Area: other topics in machine learning (i.e., none of the above)

Submission Number: 3796
