Keywords: Image to 3D; Multi-view Diffusion; Novel View Synthesis
Abstract: Trained on massive datasets, video diffusion models have shown strong generative priors for novel view synthesis tasks. Existing methods finetune these models to synthesize 360-degree orbit videos from input images. While these methods demonstrate the pretrained models' generalization ability, they are constrained by the temporal attention assumption and struggle to generate highly consistent results. Additionally, generating novel views as a sequence of twenty or more frames incurs high computational costs compared to sparse view synthesis methods. Sparse novel view synthesis methods finetuned from traditional 2D diffusion models, on the other hand, can generate highly consistent images from arbitrary camera positions but suffer from poor generalization, leading to unsatisfactory results on out-of-domain inputs. In this paper, we explore leveraging the rich generative priors of video diffusion models to enhance sparse novel view generation models. Specifically, we investigate the generation process of video diffusion models and identify key observations that allow geometrical priors to be extracted from them. Based on this, we propose a novel framework, U3D, for sparse novel view synthesis. U3D includes a geometrical reference network that integrates these priors into the sparse novel view synthesis network, and a temporally enhanced sparse view generation network that preserves pretrained temporal knowledge. By leveraging the significant generative priors of video diffusion models, our framework synthesizes highly consistent sparse novel views with strong generalization ability, which can be reconstructed into high-quality 3D assets using feed-forward sparse view reconstruction methods.
Supplementary Material: zip
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7470