Abstract: Following rapid advancements in text and image generation, research has increasingly shifted towards 3D generation. Unlike images, which have a well-established pixel-based representation, 3D representations remain diverse and fragmented, encompassing approaches such as voxel grids, neural radiance fields, signed distance functions, point clouds, and octrees, each offering distinct advantages and limitations.
In this work, we present a unified evaluation framework designed to assess the performance of 3D representations in reconstruction and generation. We compare these representations on multiple criteria: quality, computational efficiency, and generalization performance. Beyond standard model benchmarking, our experiments aim to derive best practices across all steps of the 3D generation pipeline, including preprocessing, mesh reconstruction, compression with autoencoders, and generation. Our findings highlight that reconstruction errors significantly impact overall performance, underscoring the need to evaluate generation and reconstruction jointly.
We provide insights that can inform the selection of suitable 3D models for various applications, facilitating the development of more robust and application-specific solutions in 3D generation. 
The code for our framework is available at https://github.com/isl-org/unifi3d.
Submission Length: Regular submission (no more than 12 pages of main content)
Video: https://drive.google.com/file/d/1QNvrNmQXo5F3ZKoq9ZdQ0iGaIp3BwrVn/view?usp=share_link
Code: https://github.com/isl-org/unifi3d
Assigned Action Editor: ~David_Fouhey2
Submission Number: 5001