Keywords: 3D generative models, diffusion models
TL;DR: We propose a compression framework that reduces 3D generative model size while preserving quality. Guided by Transformer layer vitality, it combines structured pruning, adaptive quantization, and lightweight finetuning to enable efficient 3D generation.
Abstract: We propose a novel compression framework for image-to-3D generative models that substantially reduces model size while preserving synthesis fidelity. Current Diffusion Transformer (DiT) architectures achieve impressive quality but remain prohibitively expensive due to large parameter counts and memory demands. Unlike prior work that focuses only on inference acceleration, our approach directly reduces model capacity by leveraging layer vitality, a measure of each layer's contribution to generation quality. Guided by this analysis, we combine structured pruning, vitality-aware adaptive quantization, and lightweight finetuning to maintain fidelity under compression. Experiments on state-of-the-art 3D models, including Step1X-3D, Hunyuan3D 2.0, and Hunyuan3D 2mini, demonstrate reductions of up to 66% in model size while preserving synthesis performance. Our framework offers a plug-and-play path toward making high-quality 3D synthesis broadly accessible in resource-constrained environments.
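To make the pipeline concrete, below is a minimal PyTorch sketch of one plausible reading of the approach, assuming vitality is measured as the output deviation caused by skipping a Transformer block on a small calibration batch. The abstract does not specify the actual vitality metric, pruning ratio, or bit-width schedule, so `TinyDiT`, `layer_vitality`, `keep_ratio`, and the threshold `tau` are all illustrative placeholders, not the paper's method.

```python
# Hypothetical sketch of vitality-guided compression: every constant and
# metric below is an assumption for illustration, not the paper's recipe.
import torch
import torch.nn as nn

class TinyDiT(nn.Module):
    """Stand-in for a Diffusion Transformer: a stack of Transformer blocks."""
    def __init__(self, dim=64, depth=8):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
            for _ in range(depth)
        )

    def forward(self, x):
        for blk in self.blocks:
            x = blk(x)
        return x

@torch.no_grad()
def layer_vitality(model, calib):
    """Score each block by how much the final output changes when that
    block is replaced with the identity (higher = more vital)."""
    ref = model(calib)
    scores = []
    for i in range(len(model.blocks)):
        x = calib
        for j, blk in enumerate(model.blocks):
            if j != i:  # skip block i, keep the rest
                x = blk(x)
        scores.append((ref - x).norm().item() / ref.norm().item())
    return scores

@torch.no_grad()
def prune_low_vitality(model, scores, keep_ratio=0.5):
    """Structured pruning: drop whole blocks with the lowest vitality.
    keep_ratio is an arbitrary placeholder, not the paper's setting."""
    k = max(1, int(len(scores) * keep_ratio))
    keep = sorted(sorted(range(len(scores)), key=lambda i: -scores[i])[:k])
    model.blocks = nn.ModuleList(model.blocks[i] for i in keep)
    return keep

def bit_widths(scores, low=4, high=8, tau=0.05):
    """Vitality-aware quantization schedule: vital blocks keep more bits."""
    return [high if s >= tau else low for s in scores]

model = TinyDiT()
calib = torch.randn(2, 16, 64)  # tiny calibration batch
scores = layer_vitality(model, calib)
kept = prune_low_vitality(model, scores)
bits = bit_widths([scores[i] for i in kept])
print(f"kept blocks {kept} with per-block bit-widths {bits}")
```

In the full framework, a lightweight finetuning pass would follow pruning and quantization to recover any lost fidelity; that stage is omitted from this sketch.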
Primary Area: generative models
Submission Number: 136