DreamDPO: Aligning Text-to-3D Generation with Human Preferences via Direct Preference Optimization

Published: 01 May 2025, Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: DreamDPO is a framework that incorporates human preferences into text-to-3D generation, achieving state-of-the-art results with improved quality and fine-grained controllability.
Abstract: Text-to-3D generation automates 3D content creation from textual descriptions, offering transformative potential across various fields. However, existing methods often struggle to align generated content with human preferences, limiting their applicability and flexibility. To address these limitations, in this paper we propose DreamDPO, an optimization-based framework that integrates human preferences into the 3D generation process through direct preference optimization. Concretely, DreamDPO first constructs pairwise examples, then evaluates their alignment with human preferences using reward or large multimodal models, and finally optimizes the 3D representation with a preference-driven loss function. By leveraging relative preferences, DreamDPO reduces reliance on precise quality evaluations while enabling fine-grained controllability through preference-guided optimization. Experiments demonstrate that DreamDPO achieves state-of-the-art results and provides higher-quality, more controllable 3D content compared to existing methods. The code and models will be open-sourced.
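To illustrate the three-step recipe in the abstract (pairwise construction, preference evaluation, preference-driven optimization), below is a minimal PyTorch-style sketch. It assumes hypothetical callables `render` (a differentiable renderer of the 3D representation) and `reward_fn` (a differentiable reward or multimodal scoring model); the paper's actual pairwise construction and loss may differ, so refer to the linked repository for the real implementation.

```python
import torch
import torch.nn.functional as F

def preference_step(render, reward_fn, params, optimizer, prompt, beta=1.0):
    """One illustrative optimization step of a preference-driven 3D update.

    `render`, `reward_fn`, `params` are hypothetical stand-ins, not the
    paper's API: `render(params)` returns an image tensor with gradients
    to the 3D representation, `reward_fn(img, prompt)` returns a scalar
    text-image alignment score.
    """
    # 1) Construct a pairwise example: two stochastic renders of the
    #    current 3D representation (e.g., different cameras or noise).
    img_a = render(params)
    img_b = render(params)

    # 2) Evaluate alignment with a reward / large multimodal model; the
    #    higher-scoring render is treated as the preferred example.
    score_a = reward_fn(img_a, prompt)
    score_b = reward_fn(img_b, prompt)
    if score_a.detach() >= score_b.detach():
        s_win, s_lose = score_a, score_b
    else:
        s_win, s_lose = score_b, score_a

    # 3) Preference-driven (Bradley-Terry / DPO-style) loss on the score
    #    gap: increase the preferred render's score relative to the other,
    #    using only relative preference rather than an absolute target.
    loss = -F.logsigmoid(beta * (s_win - s_lose))

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

This sketch only conveys the overall structure: relative comparison of two candidates drives the gradient, so no precise absolute quality score is required.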
Lay Summary: Creating 3D objects from a text prompt, such as turning “a rough rock” into a 3D model, has promising applications in gaming, design, and virtual reality. However, current methods often fail to match human expectations. We propose DreamDPO, an optimization-based framework that aligns 3D generation with human preferences. Instead of relying on strict quality scores, DreamDPO learns from pairwise comparisons, similar to choosing which of two versions looks better. This feedback helps produce more realistic, customizable, and appealing 3D models. By prioritizing human judgment over rigid metrics, DreamDPO enables more flexible and accurate 3D creation. Experiments show it outperforms existing methods, delivering higher-quality results and greater user control.
Link To Code: https://github.com/ZhenglinZhou/DreamDPO
Primary Area: Applications->Computer Vision
Keywords: 3D Generation, Human Preference, Direct Preference Optimization
Submission Number: 1732