Abstract: Because diffusion models have shown great potential in image generation, many image composition methods built on pretrained diffusion models have recently been proposed for image illumination harmonization. However, they face two key challenges: 1) effectively preserving the foreground appearance (e.g., content structure and texture details); 2) generating a reasonable cast shadow for the foreground. To this end, we propose a novel Image Illumination Harmonization Diffusion model, called I²HDiffuser, which achieves image illumination harmonization with high-fidelity foreground appearance and reasonable cast shadows. I²HDiffuser mainly consists of a frequency domain feature enhancement branch (FDFEB) and an illumination-shadow consistency generation branch (ISCGB). Specifically, FDFEB first introduces the Wavelet Transform Module (WTM), which uses the Haar wavelet transform to decompose composite image features into low-frequency components (e.g., illumination features) and high-frequency components (e.g., texture and content structure features). Then the Multi-Condition Guidance Mechanism (M-CGM) is proposed to make these components interact as prior conditions, which are further injected into the ISCGB through a noise-to-denoise process to guide foreground regeneration that is high-fidelity in content and aware of the background illumination. Meanwhile, a step-wise iterative shadow mask optimization strategy is introduced into the ISCGB to explicitly provide a reasonable shadow generation space for foreground objects. Extensive experiments on the public image harmonization datasets DESOBAv2 and iHarmony4 and on the real illumination harmonization dataset IH-SG show that I²HDiffuser achieves superior performance.
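To illustrate the kind of low-/high-frequency split that the abstract attributes to the WTM, the following is a minimal sketch of a single-level 2D Haar decomposition in NumPy. It is not the authors' implementation: the function names `haar_dwt2` and `haar_idwt2`, the subband names, and the choice to operate on a single 2D array rather than multi-channel network features are all assumptions made for illustration.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level orthonormal 2D Haar transform of an (H, W) array.

    Hypothetical helper for illustration; H and W must be even.
    Returns the low-frequency band and a tuple of three high-frequency bands.
    """
    x = np.asarray(x, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2D array with even height and width")

    # The four pixels of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right

    low = (a + b + c + d) / 2.0        # block average (scaled): smooth content
    row_diff = (a + b - c - d) / 2.0   # top row minus bottom row
    col_diff = (a - b + c - d) / 2.0   # left column minus right column
    diag_diff = (a - b - c + d) / 2.0  # diagonal difference
    return low, (row_diff, col_diff, diag_diff)


def haar_idwt2(low, highs):
    """Inverse of haar_dwt2: rebuilds the (2H, 2W) array from its subbands."""
    row_diff, col_diff, diag_diff = highs
    h, w = low.shape
    x = np.empty((2 * h, 2 * w), dtype=np.float64)
    x[0::2, 0::2] = (low + row_diff + col_diff + diag_diff) / 2.0
    x[0::2, 1::2] = (low + row_diff - col_diff - diag_diff) / 2.0
    x[1::2, 0::2] = (low - row_diff + col_diff - diag_diff) / 2.0
    x[1::2, 1::2] = (low - row_diff - col_diff + diag_diff) / 2.0
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((8, 8))
    low, highs = haar_dwt2(image)
    restored = haar_idwt2(low, highs)
    print(low.shape, np.allclose(restored, image))
```

In this sketch the low band carries the slowly varying content (where illumination-like information would mostly reside), while the three difference bands carry edges and texture. Because the transform is invertible, splitting the features this way loses no information.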
External IDs: doi:10.1145/3746027.3755314