MultiTune: Phase-Aware Multi-Objective Optimization for Diffusion Models

Renye Yan; Jikang Cheng; Shikun Sun; Yi Sun; Wei Peng; Yaozhong Gan; You Wu; Ling Liang; Junliang Xing; Yimao Cai

08 Sept 2025 (modified: 11 Feb 2026). Submitted to ICLR 2026. Readers: Everyone. License: CC BY 4.0

Keywords: Diffusion Model; Text-to-Image

Abstract: Diffusion models excel at basic text-to-image generation but struggle to align with specific objectives. While reinforcement learning offers a promising solution, single-reward setups often lead to overfitting. Multi-objective optimization methods have been proposed to address this. However, such methods suffer from goal conflicts, inflexible reward fusion, and low efficiency, which hinder overall performance across diverse criteria. To address these challenges, we propose MultiTune, a lightweight multi-objective framework tailored to the diffusion process. We decompose the optimization targets into Phase Objectives and Main Objectives: the former provide stepwise guidance across multiple phases, and the latter ensure overall convergence. We first introduce a phase-aware switching strategy that aligns with the structural-to-textural evolution in diffusion, enabling dynamic and decoupled scheduling of Phase Objectives. Then, we adaptively balance the Phase and Main Objectives based on variations in image quality for on-demand collaboration. Experiments demonstrate that MultiTune outperforms state-of-the-art methods in aesthetics, semantics, details, and style, achieving leading performance across five quantitative metrics.

Supplementary Material: zip

Primary Area: generative models

Submission Number: 3017
