Time Dependent Loss Reweighting for Flow Matching and Diffusion Models is Theoretically Justified
Keywords: Generative modeling; flow matching; diffusion models; generator matching
TL;DR: We clarify that time-dependent loss reweighting, often used in practice, is theoretically justified under a large class of generative models.
Abstract: Here we clarify that, in Generator Matching (which subsumes a large family of flow matching and diffusion models over continuous, manifold, and discrete spaces), both the Bregman divergence loss and the linear parameterization of the generator can depend on both the current state $X_t$ and the time $t$, and we show that the expectation over time in the loss can be taken with respect to a broad class of time distributions. We also show this for Edit Flows, which falls outside of Generator Matching. That the loss can depend on $t$ clarifies that time-dependent loss weighting schemes, often used in practice to stabilize training, are theoretically justified when the specific flow or diffusion scheme is a special case of Generator Matching (or Edit Flows). It also often simplifies the construction of $X_1$-predictor schemes, which are sometimes preferred for model-related reasons. We show examples that rely upon the dependence of linear parameterizations, and of the Bregman divergence loss, on $t$ and $X_t$.
Submission Number: 141
Loading