3D-Pix: 3D Editing with Single-Shot Multi-View Diffusion and Gaussian Splatting

Published: 20 Dec 2025, Last Modified: 20 Dec 2025 | CVPR 2025 | CC BY 4.0
Keywords: 3D editing, multi-view diffusion, gaussian splatting, pix2pix, stable diffusion
TL;DR: 3D-Pix is a 3D editing approach that combines a custom multi-view modification of Instruct Pix2Pix with 3D reconstruction via Gaussian Splatting, enabling high-resolution editing of varied 3D inputs, including digital assets and real-life objects.
Abstract: The rapid advancements in 3D visual generative AI are driven by improvements in the quality and realism of 2D generative models, alongside recent developments in efficient 3D reconstruction techniques. In this work, we address the problem of 3D editing by developing a consistent multi-view 2D editing model and leveraging 3D reconstruction methods to obtain a 3D representation. Our approach generalizes across various inputs, including renderings of digital 3D assets and turntable videos of real-world objects. Furthermore, this generalization enables our method to be applied as a post-processing step to any existing 3D generative approach, regardless of the underlying geometry representation model. We introduce 3D-Pix, a model that integrates 2D generation with 3D reconstruction to facilitate 3D editing. A key component of our approach is MV Instruct Pix2Pix XL, a modified version of Instruct Pix2Pix, designed to generate consistent multi-view images of the same object using the Stable Diffusion XL image generation model. To ensure coherence across multiple views, we employ a novel interpolation mechanism that enables single-inference processing for consistent editing across multiple images. Additionally, we enhance output fidelity by incorporating a super-resolution upscaling step. The geometry of the asset is estimated using a state-of-the-art 3D Gaussian Splatting model. Our proposed 3D-Pix model effectively balances appearance refinement and geometric accuracy, particularly in preserving high-frequency details and achieving high-fidelity results.
Camera Ready Version: zip
Submission Number: 22