Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models

Published: 25 Jun 2025, Last Modified: 07 Jul 2025ICCV 2025 ConferenceEveryoneCC BY 4.0
Abstract: Video generation using diffusion models has shown remarkable progress, yet it remains computationally expensive due to the repeated processing of redundant features across blocks and steps. To address this, we propose a novel adaptive feature reuse mechanism that dynamically identifies and caches the most informative features by focusing on foreground and caching more on background, significantly reducing computational overhead with less sacrificing video quality. By leveraging the step and block caching, our method achieves up to 1.8× speed up on HunyuanVideo while maintaining competitive performance on Vbench, PSNR, SSIM, FID and LPIPS. Extensive experiments demonstrate that our approach not only improves efficiency but also enhances the quality of generated videos. The proposed method is generalizable and can be integrated into existing diffusion transformer frameworks.
Loading