VolumeDiffusion: Feed-forward text-to-3D generation with efficient volumetric encoder

Published: 2025, Last Modified: 14 Nov 2025 | Graph. Model. 2025 | CC BY-SA 4.0
Abstract: Highlights
• We propose a feed-forward encoder that directly transforms multi-view images of an object into a semantic volumetric neural representation.
• We propose new noise schedules and low-frequency noise techniques to effectively train diffusion models on the feature volumes.
• We conduct extensive experiments that demonstrate the excellent generation quality and efficient inference of our method.