Text-to-3D Generation using Jensen-Shannon Score Distillation

Khoi Hoang Do; Binh-Son Hua

Published: 05 Nov 2025, Last Modified: 30 Jan 2026, 3DV 2026 Poster, CC BY 4.0

Keywords: Text-to-3D Generation, Score Distillation, Jensen-Shannon Divergence, Low-density Sampling

Abstract: Score distillation sampling is an effective technique for generating 3D models from text prompts, using pre-trained large-scale text-to-image diffusion models as guidance. However, the resulting 3D assets tend to be oversaturated, over-smoothed, and limited in diversity. These issues stem from the reverse Kullback–Leibler (KL) divergence objective, which makes the optimization unstable and leads to mode-seeking behavior. In this paper, we derive a bounded score distillation objective based on the Jensen-Shannon divergence (JSD), which stabilizes the optimization process and produces high-quality 3D assets. Because JSD matches the generated and target distributions well, it mitigates mode seeking. We provide a practical implementation of JSD by using the theory of generative adversarial networks to define an approximate objective function for the generator, assuming the discriminator is well-trained. Further assuming that the discriminator takes the form of a log-odds classifier, we propose a minority sampling algorithm to estimate the gradients of the proposed objective. We conduct both theoretical and empirical studies to validate our method. Experimental results on T3Bench demonstrate that our method can produce high-quality and diverse 3D assets.
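The boundedness claim in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation: it uses toy discrete distributions (the names `target` and `generated` are hypothetical) in place of rendered-image distributions, and it only compares the two divergences, not their score distillation gradients. It shows that reverse KL is unbounded when the generated distribution puts mass where the target has none, while JSD never exceeds log 2.

```python
import math


def kl(p, q):
    """KL(p || q) for discrete distributions given as lists of probabilities.

    Terms with p_i == 0 contribute 0; if p_i > 0 where q_i == 0,
    the divergence is infinite.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total


def jsd(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture m."""
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Toy example (assumption): the two distributions have disjoint support.
target = [0.5, 0.5, 0.0]
generated = [0.0, 0.0, 1.0]

reverse_kl = kl(generated, target)  # KL(generated || target) is infinite here
js = jsd(generated, target)         # equals log 2, the upper bound of JSD

print("reverse KL:", reverse_kl)
print("JSD:", js, "upper bound log 2:", math.log(2))
```

In this worst case the reverse KL objective diverges, whereas JSD saturates at its maximum, which gives one intuition for why a bounded objective can be easier to optimize.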

Supplementary Material: pdf

Submission Number: 38
