Keywords: Scaling Laws, Diffusion Models, Transformers, Generative Models
Abstract: Diffusion transformers (DiT) have achieved appealing synthesis quality and scaling properties in content creation, \emph{e.g.,} image and video generation.
However, the scaling laws of DiT, which offer precise predictions of the optimal model size and data requirements for a given compute budget, remain underexplored.
Therefore, we conduct experiments across a broad range of compute budgets, from \texttt{1e17} to \texttt{6e18} FLOPs, to confirm the existence of scaling laws in DiT \emph{for the first time}. Concretely, the pretraining loss of DiT follows a power-law relationship with the compute involved.
Based on the scaling law, we can not only determine the optimal model size and the required data but also accurately predict the text-to-image generation loss for a model with 1B parameters and a compute budget of \texttt{1e21} FLOPs (an illustrative fitting sketch is given after the abstract).
Additionally, we demonstrate that the trend of the pretraining loss matches the generation performance (\emph{e.g.,} FID), even across various datasets, which complements the mapping from compute to synthesis quality and thus provides a predictable benchmark for assessing model performance and data quality at a reduced cost.
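The following is a minimal, hypothetical sketch (not the authors' code) of how a power law of the form $L(C) = a \cdot C^{-b}$ can be fitted to loss-versus-compute measurements and extrapolated to a larger budget; the paper's actual parameterization, data, and fitted coefficients are not given here, and all numbers below are made up for illustration.

```python
import numpy as np

# Hypothetical (compute, loss) pairs spanning the budgets mentioned in the
# abstract; the values are invented solely to illustrate the fitting procedure.
compute = np.array([1e17, 3e17, 1e18, 3e18, 6e18])  # FLOPs
loss = np.array([0.52, 0.47, 0.42, 0.38, 0.36])     # pretraining loss

# Fit log L = log a - b * log C, i.e., a straight line in log-log space.
slope, log_a = np.polyfit(np.log(compute), np.log(loss), deg=1)
a, b = np.exp(log_a), -slope

print(f"Fitted power law: L(C) ~ {a:.3g} * C^(-{b:.3g})")
print(f"Extrapolated loss at 1e21 FLOPs: {a * 1e21 ** (-b):.3f}")
```

Fitting in log-log space keeps the regression numerically stable despite the wide range of compute values; the extrapolated point mirrors the kind of prediction the abstract describes for a \texttt{1e21}-FLOP budget.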
Supplementary Material: pdf
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1805