SPF-Portrait: Towards Pure Text-to-Portrait Customization with Semantic Pollution-Free Fine-Tuning

04 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Diffusion Model, Text-to-Image, Concept Customization
Abstract: Fine-tuning a pre-trained text-to-image (T2I) diffusion model on a tailored portrait dataset is the mainstream approach to text-to-portrait customization. However, existing methods often severely disrupt the original model's behavior (e.g., changes in ID, posture, and layout) while customizing the portrait. To address this issue, we propose SPF-Portrait, a pioneering work toward pure text-to-portrait customization: direct text-conditioned personalized portrait generation whose differences stem purely from the target attributes, with the original model's behavior preserved before and after customization. To eliminate the interference that conventional customization imposes on the original model, SPF-Portrait introduces an additional dual-path alignment stage after standard fine-tuning. This stage uses the pre-trained T2I diffusion model as a reference for the fine-tuned model and achieves behavioral alignment by contrastively constraining the intermediate diffusion features of the two paths. To accurately align target-unrelated attributes with the original behavior without weakening the target response, we propose a novel Semantic-Aware Fine Control Map, which perceives the desired response region of the target semantics and adaptively guides the alignment process, preventing over-alignment of the customized portrait with the original one. Furthermore, to improve the fidelity of the target attributes, we introduce a novel target response enhancement mechanism that uses our proposed representation bias as a supervisory signal to mitigate the cross-modal discrepancy of direct text-image supervision, thereby reinforcing both the target attributes and the overall quality of the portraits. Extensive experiments demonstrate the superiority of our method.
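The dual-path alignment described above can be pictured as a masked feature-matching loss between the frozen pre-trained model and its fine-tuned copy. The sketch below is only an illustration of that idea, not the authors' implementation: the names (dual_path_alignment_loss, ref_feats, ft_feats, control_map) are hypothetical, and the paper's actual Semantic-Aware Fine Control Map and contrastive formulation may differ.

import torch
import torch.nn.functional as F

def dual_path_alignment_loss(ref_feats, ft_feats, control_map):
    # ref_feats / ft_feats: lists of intermediate feature maps [B, C, H, W]
    # from the frozen reference path and the fine-tuned path, respectively.
    # control_map: [B, 1, H, W] in [0, 1]; high values mark the region that
    # should respond to the target attribute (assumed to come from a
    # semantic-aware map such as the one the paper proposes).
    loss = 0.0
    for r, f in zip(ref_feats, ft_feats):
        # Resize the control map to the resolution of each feature level.
        m = F.interpolate(control_map, size=r.shape[-2:], mode="bilinear",
                          align_corners=False)
        # Penalize deviation from the reference features only outside the
        # target-response region, so target-unrelated behavior is aligned
        # while the target attribute remains free to change.
        loss = loss + (((f - r.detach()) ** 2) * (1.0 - m)).mean()
    return loss

In this reading, the fine-tuned path receives gradients while the reference features are detached, and the (1 - m) weighting is what prevents over-alignment inside the target-response region.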
Supplementary Material: zip
Primary Area: generative models
Submission Number: 2169