Keywords: Single Image to 3D Generation, Retrieve-and-Edit, 3D Prior, Flat-Colored Images, Latent Space Editing
TL;DR: REVIVE3D is a training-free framework that generates high-fidelity 3D models from single, flat-colored images by retrieving a shape-aligned 3D prior and directly editing its latent space to inject the input's geometry and fine details.
Abstract: We introduce REVIVE3D, a retrieve-and-edit framework for single image-to-3D generation. Our method is designed for flat-colored images such as cartoons and drawings with minimal shading. Instead of 2D preprocessing, REVIVE3D first retrieves a shape-aligned 3D prior and then edits it directly in a 3D latent space. The edit is guided by the visual difference between the input image and an aligned render of the retrieved prior. This direct 3D operation injects the missing volumetric cues and preserves the global structure of the shape. The method is plug-and-play and requires no retraining. It produces results in approximately 2 minutes using only 0.6 GB of memory on a single GPU. On the Art3D test set, REVIVE3D achieves state-of-the-art image-to-3D alignment and consistently reconstructs complete geometry with fine details, while also performing well on standard images. These results demonstrate that direct latent editing of a retrieved 3D prior is an effective and practical route to high-fidelity 3D from flat-colored images.
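Illustration: the retrieve-then-edit idea described in the abstract can be sketched on toy vectors. This is not the REVIVE3D implementation. It assumes a linear `render_matrix` in place of a real renderer, cosine similarity in place of shape-aligned retrieval, and plain gradient descent with an anchor term in place of the actual latent edit. All function and parameter names are hypothetical.

```python
import numpy as np


def retrieve_prior(query_feat, bank_feats):
    """Return the index of the bank row most similar to the query (cosine)."""
    q = query_feat / np.linalg.norm(query_feat)
    b = bank_feats / np.linalg.norm(bank_feats, axis=1, keepdims=True)
    return int(np.argmax(b @ q))


def edit_latent(z0, render_matrix, target_image, steps=200, anchor=0.01):
    """Edit a latent so its toy render moves toward the target image.

    Minimizes ||R z - target||^2 + anchor * ||z - z0||^2 by gradient descent.
    The anchor term keeps the edit close to the retrieved prior, a stand-in
    for preserving global structure (an assumption of this sketch).
    """
    # Step size 1/L, where L is the Lipschitz constant of the gradient.
    lipschitz = 2.0 * (np.linalg.norm(render_matrix, 2) ** 2 + anchor)
    step = 1.0 / lipschitz
    z = z0.copy()
    for _ in range(steps):
        residual = render_matrix @ z - target_image  # render vs. input difference
        grad = 2.0 * render_matrix.T @ residual + 2.0 * anchor * (z - z0)
        z -= step * grad
    return z


# Toy data: 5 prior latents of dimension 8, rendered to 16-pixel "images".
rng = np.random.default_rng(0)
render_matrix = rng.standard_normal((16, 8))
prior_latents = rng.standard_normal((5, 8))
prior_renders = prior_latents @ render_matrix.T

# The "input image" is the render of a slightly perturbed copy of prior 2.
true_latent = prior_latents[2] + 0.05 * rng.standard_normal(8)
input_image = render_matrix @ true_latent

idx = retrieve_prior(input_image, prior_renders)
edited = edit_latent(prior_latents[idx], render_matrix, input_image)

err_before = np.linalg.norm(render_matrix @ prior_latents[idx] - input_image)
err_after = np.linalg.norm(render_matrix @ edited - input_image)
print(f"retrieved prior {idx}; render error {err_before:.4f} -> {err_after:.4f}")
```

The sketch only shows the two-stage structure: pick the closest prior first, then let the remaining image difference drive a small edit in latent space instead of regenerating the shape from scratch.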
Submission Number: 14