Pi-E-Flow: Uncertainty-Guided Flow Distillation for Autoregressive Video Generation
Keywords: Diffusion Model, Autoregressive Video Model, Few-Step Distillation
TL;DR: Few-Step AR video model with uncertainty-based FlowMap distillation
Abstract: Few-step diffusion and flow distillation have made image generation much
faster, but the same recipe is brittle for autoregressive (AR) video. We
argue that the missing ingredient is not simply more computation, but
better allocation of computation: in AR video, the student faces highly
non-uniform uncertainty when imitating the teacher trajectory. Some
denoising steps, especially those near the $t=0$ endpoint, are fragile, and
some spatial regions, such as areas with motion, require substantially more
refinement than static backgrounds. We introduce _Pi-E-Flow_, an uncertainty-guided
flow distillation method for few-step AR video generation. Pi-E-Flow allocates
generation compute along two axes. Along denoising time, it measures per-step
uncertainty with a teacher-imitation error and chooses sampling schedules that
balance this uncertainty across the few sampling steps. Along space, it trains
the model on heterogeneous patch timesteps, learns per-patch uncertainty, and
assigns larger NFE (number of function evaluations) budgets only to uncertain
patches while promoting completed patches into the AR cache.
This turns uniform few-step distillation into elastic compute allocation
over the parts of the video trajectory where the student is most uncertain.
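
The two allocation axes can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `balanced_schedule` and `allocate_patch_nfe`, the equal-cumulative-error rule for placing timesteps, and the linear mapping from patch uncertainty to an NFE budget are all assumptions chosen for illustration, and the abstract does not specify how either quantity is actually computed.

```python
import numpy as np


def balanced_schedule(t_grid, step_error, num_steps):
    """Place num_steps + 1 timesteps so that each interval carries roughly
    the same share of the measured step error (hypothetical rule)."""
    t_grid = np.asarray(t_grid, dtype=float)
    # Small offset keeps the cumulative curve strictly increasing.
    density = np.asarray(step_error, dtype=float) + 1e-8
    # Trapezoidal cumulative error over the time grid, normalised to [0, 1].
    segments = 0.5 * (density[1:] + density[:-1]) * np.diff(t_grid)
    cumulative = np.concatenate([[0.0], np.cumsum(segments)])
    cumulative /= cumulative[-1]
    # Invert the cumulative curve at equally spaced error levels.
    targets = np.linspace(0.0, 1.0, num_steps + 1)
    return np.interp(targets, cumulative, t_grid)


def allocate_patch_nfe(patch_uncertainty, min_nfe, max_nfe):
    """Map per-patch uncertainty linearly onto an integer NFE budget
    in [min_nfe, max_nfe] (hypothetical rule)."""
    u = np.asarray(patch_uncertainty, dtype=float)
    spread = u.max() - u.min()
    if spread == 0.0:
        # All patches equally uncertain: give every patch the minimum budget.
        return np.full(u.shape, min_nfe, dtype=int)
    scaled = (u - u.min()) / spread
    return (min_nfe + np.rint(scaled * (max_nfe - min_nfe))).astype(int)


if __name__ == "__main__":
    t = np.linspace(0.0, 1.0, 101)
    # Synthetic error profile that grows toward t = 0, for illustration only.
    err = 1.0 / (t + 0.05)
    print(balanced_schedule(t, err, num_steps=4))
    print(allocate_patch_nfe([0.1, 0.1, 0.9, 0.4], min_nfe=1, max_nfe=4))
```

With an error profile that peaks near $t=0$, the schedule places its timesteps more densely near that endpoint, and patches with higher uncertainty receive more function evaluations. The returned timesteps are in ascending order; the direction in which a sampler traverses them depends on the time convention in use.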
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 111