Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models

Xuran Ma, Yexin Liu, LIU Yaofu, Xianfeng Wu, Mingzhe Zheng, Zihao Wang, Ser-Nam Lim, Harry Yang

Published: 25 Jun 2025, Last Modified: 28 Jan 2026, ICCV 2025 Conference, CC BY 4.0

Abstract: Video generation using diffusion models has shown remarkable progress, yet it remains computationally expensive because redundant features are processed repeatedly across blocks and steps. To address this, we propose a novel adaptive feature reuse mechanism that dynamically identifies and caches the most informative features, focusing computation on the foreground and caching more of the background, which significantly reduces computational overhead with little sacrifice in video quality. By combining step and block caching, our method achieves up to a 1.8× speedup on HunyuanVideo while maintaining competitive performance on VBench, PSNR, SSIM, FID, and LPIPS. Extensive experiments demonstrate that our approach not only improves efficiency but also enhances the quality of generated videos. The proposed method is generalizable and can be integrated into existing diffusion transformer frameworks.
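
To make the idea of step and block caching concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation. The class name `CachedBlockStack`, the hard-coded `reuse_plan`, and the choice to cache each block's residual (output minus input) are all hypothetical. In the paper the decision of what to cache comes from profiling the model, which this toy does not attempt.

```python
from typing import Callable, Dict, List, Set

Vector = List[float]
Block = Callable[[Vector], Vector]


class CachedBlockStack:
    """Toy stack of blocks that can reuse a cached residual on chosen steps.

    Hypothetical illustration: reuse_plan maps a block index to the set of
    denoising steps on which that block is skipped and its last cached
    residual (output - input) is added instead.
    """

    def __init__(self, blocks: List[Block], reuse_plan: Dict[int, Set[int]]) -> None:
        self.blocks = blocks
        self.reuse_plan = reuse_plan
        self.residuals: Dict[int, Vector] = {}
        self.calls = 0  # number of block evaluations actually executed

    def forward(self, x: Vector, step: int) -> Vector:
        for i, block in enumerate(self.blocks):
            reuse = step in self.reuse_plan.get(i, set()) and i in self.residuals
            if reuse:
                # Skip the block and apply the residual cached on an earlier step.
                x = [a + r for a, r in zip(x, self.residuals[i])]
            else:
                # Run the block and refresh its cached residual.
                y = block(x)
                self.calls += 1
                self.residuals[i] = [b - a for a, b in zip(x, y)]
                x = y
        return x


# Two stand-in "blocks"; a real model would use transformer blocks here.
blocks: List[Block] = [
    lambda x: [1.1 * v for v in x],
    lambda x: [v + 0.5 for v in x],
]
reuse_plan = {1: {1, 3}}  # assumed plan: reuse block 1 on steps 1 and 3

model = CachedBlockStack(blocks, reuse_plan)
latent = [1.0, 2.0]
outputs = [model.forward(latent, step) for step in range(4)]
print(model.calls)  # 6 block evaluations instead of 8
```

In this toy, block 1 is evaluated on only two of the four steps, so the stack runs 6 block evaluations rather than 8. A content-aware variant would keep a separate plan for foreground and background tokens, reusing background features more often.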