PRISM: A Paradigm for Controllable 3D Generation Driven by Structural Concept Prior

03 Sept 2025 (modified: 14 Nov 2025), ICLR 2026 Conference Withdrawn Submission, CC BY 4.0
Keywords: Controllable 3D generation, Structural Concept, DiT
Abstract: The generation of high-quality 3D assets is essential for applications in virtual reality, robotics, and industrial systems. Existing methods fall into three categories according to the priors they use. The first line of work lifts 2D diffusion priors into 3D representations. The second uses ground-truth multi-view images as priors to directly regress 3D assets. The third models the probabilistic distribution of 3D assets, adopting the 3D distribution itself as the prior. However, all three types of priors operate at the semantic level: they capture semantic information but ignore the structural concept (the description of topological structure), which is crucial in physical-world applications. To address this limitation, we propose a novel 3D generation paradigm, PRISM (a paradigm driven by structural concept), which leverages structural concepts as priors. First, our method encodes the structural concept and fuses it with real-world images to form prior representations, enabling the model to integrate high-level structural concept priors while preserving shape details from real-world images. Next, we adopt a pre-trained VAE encoder to provide embeddings of real 3D models. We then apply a consistency loss in the latent space to align our priors with the real 3D models, establishing a mapping between concept space and 3D space and ensuring that the generated 3D assets are structurally coherent, aligned with affordance, and visually realistic. PRISM provides a solution for 3D synthesis with high shape quality and controllable structure. We validate our method on both vision and robotics tasks against state-of-the-art algorithms. Our code will be made publicly available.
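The fusion and latent-alignment steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the fusion operator or the exact form of the consistency loss, so the concatenate-and-project fusion, the mean-squared-error loss, and the names fuse_priors and consistency_loss are all assumptions made for illustration. The target latent stands in for the output of the pre-trained VAE encoder.

```python
from typing import List, Sequence

Vector = List[float]


def fuse_priors(
    concept: Sequence[float],
    image: Sequence[float],
    weights: Sequence[Sequence[float]],
) -> Vector:
    """Concatenate a structural-concept embedding with an image embedding
    and project the result with a linear map.

    Hypothetical fusion: the abstract only says the two are "fused".
    """
    joint = list(concept) + list(image)
    if any(len(row) != len(joint) for row in weights):
        raise ValueError("each weight row must match the fused input length")
    return [sum(w * x for w, x in zip(row, joint)) for row in weights]


def consistency_loss(
    prior_latent: Sequence[float],
    target_latent: Sequence[float],
) -> float:
    """Mean squared error between the prior latent and the latent of a
    real 3D model (assumed loss form; the source does not define it)."""
    if len(prior_latent) != len(target_latent):
        raise ValueError("latents must have the same dimension")
    n = len(prior_latent)
    return sum((p - t) ** 2 for p, t in zip(prior_latent, target_latent)) / n


# Toy example with made-up numbers.
concept_embedding = [1.0, 0.0]      # stand-in for an encoded structural concept
image_embedding = [0.5, 0.5]        # stand-in for real-world image features
projection = [
    [1.0, 0.0, 0.0, 0.0],           # copies the first concept dimension
    [0.0, 0.0, 1.0, 1.0],           # sums the two image dimensions
]
vae_latent = [1.0, 0.0]             # stand-in for the VAE encoder output

prior = fuse_priors(concept_embedding, image_embedding, projection)
print(prior)                                # [1.0, 1.0]
print(consistency_loss(prior, vae_latent))  # 0.5
```

In a training setting the projection would be learned and the loss minimized by gradient descent, so that the fused prior lands close to the latent of the corresponding real 3D model.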
Supplementary Material: zip
Primary Area: generative models
Submission Number: 1720