X-PlugVid: Versatile Adaptation of Image Plugins for Controllable Video Generation

Lingmin Ran; Chenyang Si; Xudong Lin; Jia-Wei Liu; Rui Zhao; Ziwei Liu; Jussi Keppo; Mike Zheng Shou

26 Sept 2024 (modified: 12 Dec 2024). ICLR 2025 Conference Withdrawn Submission. License: CC BY 4.0

Keywords: video generation, diffusion model, efficiency

Abstract: We introduce X-PlugVid, a unified framework that adapts pretrained image-based plug-and-play modules to video diffusion models, enabling controllable video generation without retraining the plugins. The framework uses a spatial-temporal adapter to bridge the gap between image and video diffusion models. Specifically, we adopt a frozen copy of a large-scale pretrained image diffusion model (e.g., Stable Diffusion v1.5) as a spatial prior. We then train a spatial-temporal adapter to convert this prior into temporally consistent guidance for video diffusion models (e.g., SVD). To further enhance the effectiveness of image plugins in guiding video models, we introduce a timestep remapping strategy. Recognizing that denoising is an entropy-reducing process, this strategy selects priors from later timesteps of the image model, which contain richer information, and injects them into the video models, improving the quality and consistency of the generated videos. Comprehensive experimental evaluations of X-PlugVid demonstrate its broad compatibility with diverse operational conditions and different plugins, confirming that leveraging priors from a pretrained diffusion model can minimize redundant training and enable versatile controllable video generation.
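The abstract does not give a formula for timestep remapping, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes the common DDPM indexing in which larger timestep indices mean more noise, so a "later" point in denoising corresponds to a smaller index. The function name `remap_timestep`, the linear rule, and the `scale` parameter are all hypothetical.

```python
def remap_timestep(t_video: int, num_train_steps: int = 1000, scale: float = 0.5) -> int:
    """Map a video-model timestep to a less noisy image-model timestep.

    Assumption (not from the paper): timesteps follow the DDPM convention,
    where index num_train_steps - 1 is the noisiest state and 0 is the
    cleanest. The linear rule and `scale` are hypothetical placeholders.
    """
    if not 0 <= t_video < num_train_steps:
        raise ValueError("t_video must lie in [0, num_train_steps)")
    if not 0.0 < scale <= 1.0:
        raise ValueError("scale must lie in (0, 1]")
    # Shrinking the index moves the image model further along the denoising
    # trajectory, so its features come from a cleaner latent.
    return int(t_video * scale)


# The image model is always queried at a timestep no noisier than the video model's.
print([remap_timestep(t) for t in (999, 500, 0)])  # [499, 250, 0]
```

With `scale=1.0` the mapping reduces to the identity, which gives a simple baseline for checking whether remapping helps in a given setup.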

Primary Area: generative models

Submission Number: 6384
