Keywords: 3D generation, 3D representation, 3D reconstruction, diffusion model
TL;DR: We propose a novel 3D representation that enhances 3D synthesis by capturing intricate object details, both inside and outside!
Abstract: We introduce X-Ray, a novel 3D sequential representation inspired by the penetrability of x-ray scans. X-Ray transforms a 3D object into a series of surface frames at different layers, making it suitable for generating 3D models from images. Our method uses ray casting from the camera center to capture geometric and textured details, including depth, normal, and color, across all intersected surfaces. This process efficiently condenses the whole 3D object into a multi-frame video format, motivating the use of a network architecture similar to those in video diffusion models. This design ensures an efficient 3D representation by focusing solely on surface information. We also propose a two-stage pipeline for generating 3D objects, consisting of an X-Ray Diffusion Model and an Upsampler. We demonstrate the practicality and adaptability of our X-Ray representation by synthesizing the complete visible and hidden surfaces of a 3D object from a single input image. Experimental results show that our representation achieves state-of-the-art accuracy in 3D generation, paving the way for new 3D representation research and practical applications.
Our project page is at \url{https://tau-yihouxiang.github.io/projects/X-Ray/X-Ray.html}.
Primary Area: Machine vision
Submission Number: 2779