Keywords: 3D generative models, diffusion models
TL;DR: We propose a compression framework that reduces 3D generative model size while preserving quality. Leveraging the non-uniform vitality of Transformer layers, we combine structured pruning, adaptive quantization, and fine-tuning to enable efficient 3D generation.
Abstract: We propose the first compression framework for image-to-3D generative models that substantially reduces model size while preserving synthesis fidelity.
Recent advances in 3D shape generative modeling, particularly Diffusion Transformer (DiT) architectures, have achieved remarkable progress in synthesis fidelity and controllability.
However, the substantial computational cost of large DiT-based image-to-3D models hinders their practical application in resource-constrained settings.
While existing efficiency-oriented approaches improve inference speed, they leave the underlying model size and computational cost of synthesis largely unchanged.
To address this challenge, we propose a systematic compression framework that physically reduces model size while preserving the fidelity of 3D shape synthesis.
Our approach builds on the observation that Transformer layers in 3D DiT models exhibit non-uniform importance, with only a subset of layers contributing significantly to geometry generation.
Leveraging this insight, we introduce a vitality-guided framework that integrates structured pruning, adaptive quantization, and targeted fine-tuning to balance efficiency and quality.
Experimental results show that our method achieves up to 66% model-size reduction across state-of-the-art 3D generative models with minimal loss in synthesis fidelity.
This highlights the potential of our framework as a plug-and-play solution for efficient 3D shape generation across diverse models.
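Below is a minimal sketch (not the authors' implementation) of the core idea of vitality-guided layer pruning in a DiT-style Transformer stack. Here layer "vitality" is approximated by the relative norm of each block's residual contribution on calibration inputs, and the least vital blocks are dropped; all class and function names are hypothetical.

```python
import torch
import torch.nn as nn


class Block(nn.Module):
    """Toy pre-norm Transformer block standing in for a DiT layer."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x):
        h = self.norm(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        return x + self.mlp(self.norm(x))


@torch.no_grad()
def layer_vitality(blocks: nn.ModuleList, calib: torch.Tensor) -> torch.Tensor:
    """Score each block by the relative size of its residual update (a proxy for importance)."""
    scores = torch.zeros(len(blocks))
    x = calib
    for i, blk in enumerate(blocks):
        y = blk(x)
        scores[i] = (y - x).norm() / (x.norm() + 1e-8)
        x = y
    return scores


def prune_least_vital(blocks: nn.ModuleList, scores: torch.Tensor, keep_ratio: float) -> nn.ModuleList:
    """Keep only the most vital blocks, preserving their original order."""
    k = max(1, int(len(blocks) * keep_ratio))
    keep = sorted(scores.topk(k).indices.tolist())
    return nn.ModuleList(blocks[i] for i in keep)


if __name__ == "__main__":
    dim, n_layers = 64, 12
    blocks = nn.ModuleList(Block(dim) for _ in range(n_layers))
    calib = torch.randn(8, 16, dim)  # stand-in calibration latents
    scores = layer_vitality(blocks, calib)
    pruned = prune_least_vital(blocks, scores, keep_ratio=0.5)
    print(f"kept {len(pruned)}/{n_layers} layers")
```

In the full framework described in the abstract, such vitality scores would also guide per-layer quantization precision, and the pruned model would be fine-tuned to recover synthesis fidelity; those steps are omitted from this sketch.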
Primary Area: generative models
Submission Number: 136