Keywords: Sketch-to-Face Synthesis, Diffusion Models, Conditional Inpainting, ControlNet
Abstract: Generating realistic faces from monochromatic sketches is challenging because sketches omit details such as expressions and skin tones. GANs struggle with stability and structural fidelity, while diffusion models handle monochrome inputs poorly and are costly to run. DSFace, a latent-diffusion-based conditional inpainting framework, addresses these challenges by combining a frozen Paint-by-Example diffusion model with ControlNet conditioning and DINO-V2 embeddings extracted from a GAN-generated coarse image. Trained on the CUFS dataset, DSFace achieves superior realism, perceptual quality, and structural alignment.
Serve As Reviewer: ~Sanhita_Pathak1
Submission Number: 9