Abstract: Exemplar-based image translation involves converting semantic masks into photorealistic images that adopt the style of a given exemplar. However, most existing GAN-based translation methods fail to produce photorealistic results. In this study, we propose a new diffusion model-based approach for generating high-quality images that are semantically aligned with the input mask and resemble an exemplar in style. The proposed method trains a conditional denoising diffusion probabilistic model (DDPM) with a SPADE module to integrate the semantic map. We then used a novel contextual loss and auxiliary color loss to guide the optimization process, resulting in images that were visually pleasing and semantically accurate. Experiments demonstrate that our method outperforms state-of-the-art approaches in terms of both visual quality and quantitative metrics.
Loading