Track: tiny / short paper (up to 4 pages)
Keywords: Conditional Diffusion Transformer, Statistical Rates, Approximation, Estimation
TL;DR: We establish the statistical foundations for Conditional Diffusion Transformers (DiTs) with classifier-free guidance.
Abstract: We explore the statistical foundations of conditional diffusion transformers (DiTs) with classifier-free guidance. Through a comprehensive analysis of "in-context" conditional DiTs under four data assumptions, we demonstrate that both conditional DiTs and their latent variants achieve the minimax optimality of unconditional DiTs. By discretizing the input domain into infinitesimal grids and performing term-by-term Taylor expansions on the conditional score function, we leverage transformers' universal approximation capability through a detailed piecewise-constant approximation, yielding tighter bounds. Extending our analysis to the latent setting under a linear latent subspace assumption, we show that latent conditional DiTs attain tighter bounds than their non-latent counterparts in both approximation and estimation.
We also establish the minimax optimality of latent unconditional DiTs. Our findings characterize the statistical limits of conditional and unconditional DiTs and offer practical guidance for developing more efficient and accurate models.
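The abstract's core approximation device is discretizing the input domain into a fine grid and replacing the score function by its value at each cell's center, i.e., a zeroth-order Taylor (piecewise-constant) approximation. The following minimal sketch, not from the paper and with all names hypothetical, illustrates the idea on a 1-D Gaussian score, whose known form is -x; the sup-norm error shrinks linearly with the grid resolution:

```python
import numpy as np

def score(x):
    # Score of a standard Gaussian: d/dx log N(x; 0, 1) = -x (known closed form).
    return -x

def piecewise_constant(x, grid_size, lo=-1.0, hi=1.0):
    # Zeroth-order Taylor approximation: snap x to the center of its grid
    # cell and evaluate the score there (piecewise-constant on the grid).
    edges = np.linspace(lo, hi, grid_size + 1)
    centers = (edges[:-1] + edges[1:]) / 2
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, grid_size - 1)
    return score(centers[idx])

xs = np.linspace(-1.0, 1.0, 10_001)
for n in (10, 100, 1000):
    err = np.max(np.abs(score(xs) - piecewise_constant(xs, n)))
    print(f"grid cells = {n:5d}, sup error = {err:.4f}")
```

For a 1-Lipschitz score on [-1, 1], the sup error is at most half a cell width, 1/n, so refining the grid tightens the bound, mirroring (in a toy setting) how the paper's finer discretization yields tighter approximation rates.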
Submission Number: 84