We present \emph{VEnhancer}, a generative space-time enhancement method that improves existing AI-generated videos both spatially and temporally with a single video diffusion model. Given a generated low-quality video, our approach increases its spatial and temporal resolution simultaneously, at arbitrary up-sampling scales in space and time, by adding detail in the spatial domain and synthesizing detailed motion in the temporal domain. Furthermore, VEnhancer removes the spatial artifacts and temporal flickering of generated videos.
To achieve this, building on a pretrained generative video prior, we train a \textbf{S}pace-\textbf{T}ime Controller (ST-Controller) and inject it into the prior as a condition on low-frame-rate and low-resolution videos. To train this ST-Controller effectively, we design \textit{space-time data augmentation} to create diversified video training pairs, as well as \textit{video-aware conditioning} for realizing different augmentation parameters in both the spatial and temporal dimensions.
Benefiting from these designs, VEnhancer can be trained end-to-end to support multiple functions in a single model.
Extensive experiments show that VEnhancer surpasses existing state-of-the-art video super-resolution and space-time super-resolution methods in enhancing AI-generated videos. Moreover, VEnhancer greatly improves the performance of open-source state-of-the-art text-to-video methods on the video generation benchmark VBench.
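As a rough illustration of the \textit{space-time data augmentation} idea described above, the following minimal sketch builds one training pair by degrading a high-quality clip in both time and space. It is not the authors' pipeline: the function names, the integer-only scales, the average-pooling downsampler, and the parameter ranges are all assumptions made for this example, and videos are represented as plain nested lists of grayscale values.

```python
import random


def average_pool(frame, s):
    """Downsample a 2D frame (list of rows) by an integer factor s using average pooling."""
    h, w = len(frame), len(frame[0])
    out = []
    # Ignore trailing rows/columns that do not fill a complete s x s block.
    for i in range(0, h - h % s, s):
        row = []
        for j in range(0, w - w % s, s):
            block = [frame[i + di][j + dj] for di in range(s) for dj in range(s)]
            row.append(sum(block) / (s * s))
        out.append(row)
    return out


def make_training_pair(video, rng, max_space_scale=4, max_frame_skip=4):
    """Create one (condition, target, params) triple.

    Hypothetical helper: the ranges and the degradation model are assumptions,
    not values taken from the paper.
    """
    space_scale = rng.randint(1, max_space_scale)  # spatial down-sampling factor
    frame_skip = rng.randint(1, max_frame_skip)    # keep every frame_skip-th frame
    key_frames = video[::frame_skip]               # lower the frame rate
    condition = [average_pool(f, space_scale) for f in key_frames]  # lower the resolution
    # The sampled parameters are returned so a model could be conditioned on them.
    params = {"space_scale": space_scale, "frame_skip": frame_skip}
    return condition, video, params


# Usage: an 8-frame, 8x8 synthetic clip.
video = [[[float(t + y + x) for x in range(8)] for y in range(8)] for t in range(8)]
condition, target, params = make_training_pair(video, random.Random(0))
```

Returning the sampled `params` alongside the degraded clip mirrors the role the abstract assigns to video-aware conditioning: the model is told which augmentation settings produced its input. How VEnhancer actually encodes these settings is not specified in this text.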
Keywords: diffusion models, video generation, generative video enhancement, video super-resolution, frame interpolation, space-time super-resolution
Supplementary Material: zip
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 10660