Phase-aware Training Schedule Simplifies Learning in Flow-Based Generative Models

ICLR 2025 Conference Submission 12263 Authors

27 Sept 2024 (modified: 28 Nov 2024) · ICLR 2025 Conference Submission · CC BY 4.0
Keywords: diffusion models, phase transitions, flow-based generative models, high-dimensional Gaussian mixtures, denoising autoencoders, training schedules
TL;DR: We introduce a time-dilated training schedule for flow-based generative models that enables the learning of high-level features in high-dimensional settings by overcoming vanishing gradients and allowing phase-specific parameter learning.
Abstract: We analyze the training of a two-layer autoencoder used to parameterize a flow-based generative model for sampling from a high-dimensional Gaussian mixture. Building on the work of Cui et al. (2024), we find that, without an appropriate time schedule, the phase during which the high-level features are learnt vanishes as the dimension goes to infinity. We introduce a time dilation that resolves this problem. This enables us to characterize the learnt velocity field, revealing a first phase where the high-level feature (the asymmetry between modes) is learnt and a second phase where the low-level feature (the distribution within each mode) is learnt. We find that the autoencoder representing the velocity field simplifies its task by estimating only the parameters relevant to the feature learnt in each phase. Turning to real data, we propose a method that, for a given feature, identifies the time intervals during which training most improves accuracy on that feature, and we validate this approach with an experiment on MNIST.
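To make the idea of a time-dilated schedule concrete, the sketch below trains a standard flow-matching model (linear/rectified-flow interpolant) while sampling the interpolation time non-uniformly, concentrating training time around a critical point where the high-level feature is learnt. This is a minimal illustration, not the authors' method: the warp `dilated_time`, its parameters `t_c` and `k`, and the small two-layer velocity network are all illustrative assumptions; the paper derives its dilation analytically for Gaussian mixtures.

```python
# Minimal sketch (not the authors' code): flow-matching training with a
# time-dilated schedule. `dilated_time`, `t_c`, `k`, and the network are
# illustrative stand-ins for the paper's analytically derived schedule
# and two-layer autoencoder.
import torch
import torch.nn as nn

def dilated_time(n, t_c=0.5, k=3):
    """Sample n times in [0, 1] with density concentrated around t_c.

    A piecewise power-law warp of uniform samples: its derivative vanishes
    at the point mapping to t_c, so t-samples pile up there. Larger odd k
    means stronger dilation. The paper's exact dilation may differ.
    """
    u = torch.rand(n)
    s = u - t_c
    right = t_c + (1 - t_c) * (s.clamp(min=0) / (1 - t_c)) ** k
    left = t_c - t_c * ((-s).clamp(min=0) / t_c) ** k
    return torch.where(s >= 0, right, left)

class TwoLayerVelocity(nn.Module):
    """Two-layer network standing in for the autoencoder velocity field."""
    def __init__(self, dim, hidden):
        super().__init__()
        self.encode = nn.Linear(dim + 1, hidden)  # +1 for the time input
        self.decode = nn.Linear(hidden, dim)

    def forward(self, x, t):
        h = torch.tanh(self.encode(torch.cat([x, t[:, None]], dim=1)))
        return self.decode(h)

def flow_matching_step(model, opt, x1, t_c=0.5, k=3):
    """One training step with a linear (rectified-flow) interpolant and
    dilated time sampling. x1: a batch from the target distribution."""
    x0 = torch.randn_like(x1)                     # noise endpoint
    t = dilated_time(x1.shape[0], t_c, k)
    xt = (1 - t[:, None]) * x0 + t[:, None] * x1  # interpolant
    target = x1 - x0                              # exact velocity of the path
    loss = ((model(xt, t) - target) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Usage sketch:
#   model = TwoLayerVelocity(dim=784, hidden=128)
#   opt = torch.optim.Adam(model.parameters(), lr=1e-3)
#   for x1 in loader:  # batches of flattened data
#       flow_matching_step(model, opt, x1.view(x1.size(0), -1))
```

With a uniform schedule (`k=1`), the fraction of training steps that fall inside the window where the high-level feature is learnable shrinks as the dimension grows; the dilation keeps that window well sampled, which is the effect the abstract describes.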
Supplementary Material: pdf
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 12263