Video diffusion generation: comprehensive review and open problems

Wenping Ma, Xiaoting Yang, Licheng Jiao, Lingling Li, Xu Liu, Fang Liu, Puhua Chen, Yuting Yang, Mengru Ma, Long Sun, Ruohan Zhang, Xueli Geng, Yuwei Guo, Shuyuan Yang, Zhixi Feng

Published: 2025, Last Modified: 25 Mar 2026. Venue: Artif. Intell. Rev. 2025. License: CC BY-SA 4.0

Abstract: Video generation has become an increasingly important component of AI-generated content (AIGC), owing to its rich semantic expressiveness and growing application potential. Among various generative paradigms, diffusion models have recently gained prominence due to their strong controllability, competitive visual quality, and compatibility with multimodal inputs. However, most existing surveys provide limited coverage of diffusion-based video generation, often lacking systematic analysis and comprehensive comparisons. To address this gap, this paper presents a thorough and structured review of diffusion models for video generation. We first outline the theoretical foundations and core architectures of diffusion models, and then introduce the key design principles of representative methods for video generation. We propose a unified taxonomy that categorizes over two hundred methods, analyzing their key characteristics, strengths, and limitations. In addition, we compare the performance of classical methods and summarize commonly used datasets and evaluation metrics in this field to ease model benchmarking and selection. Finally, we discuss open problems and future research directions, aiming to provide a valuable reference for both academic research and practical development.

External IDs: dblp:journals/air/MaYJLLLCYMSZGGYF25