SAT3D: Image-driven Semantic Attribute Transfer in 3D

Published: 20 Jul 2024, Last Modified: 21 Jul 2024 · MM 2024 Poster · CC BY 4.0
Abstract: GAN-based image editing aims at manipulating image attributes in the latent space of generative models. Most previous 2D and 3D-aware approaches focus on editing attributes with ambiguous semantics or on transferring regions from a reference image, and fail to achieve photographic semantic attribute transfer, such as transferring the beard from a photo of a man. In this paper, we propose an image-driven Semantic Attribute Transfer method in 3D (SAT3D) that edits semantic attributes taken from a reference image. Our method explores the style space of a pre-trained 3D-aware StyleGAN-based generator, learning the correlations between semantic attributes and style code channels. For guidance, we associate each attribute with a set of phrase-based descriptor groups and develop a Quantitative Measurement Module (QMM) that quantitatively describes attribute characteristics in images based on these descriptor groups, leveraging the image-text comprehension capability of CLIP. During training, the QMM is incorporated into attribute losses that measure attribute similarity between images, guiding the transfer of target semantics and the preservation of irrelevant ones. We present 3D-aware attribute transfer results across multiple domains and compare against classical 2D image editing methods, demonstrating the effectiveness and customizability of SAT3D.
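The abstract does not give the QMM's exact formulation, but one plausible reading is a softmax over CLIP image-text similarities within each descriptor group, with the attribute loss comparing the resulting distributions between two images. The sketch below assumes precomputed CLIP cosine similarities as inputs and an L1 distance between distributions; the function names and the L1 form are illustrative assumptions, not the paper's definition.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of similarity scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def qmm_score(clip_similarities):
    """Quantify an attribute as a distribution over its descriptor group.

    `clip_similarities` stands in for precomputed CLIP cosine similarities
    between an image and each phrase descriptor in the group (assumption;
    the paper's QMM may normalize differently).
    """
    return softmax(clip_similarities)

def attribute_loss(sims_generated, sims_reference):
    """L1 distance between two images' attribute distributions --
    one plausible shape for the attribute transfer/preservation losses."""
    p = qmm_score(sims_generated)
    q = qmm_score(sims_reference)
    return sum(abs(a - b) for a, b in zip(p, q))
```

Under this reading, the transfer loss would pull the edited image's distribution toward the reference image's, while a preservation loss of the same form would pin non-target attributes to the source image's distributions.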
Primary Subject Area: [Generation] Generative Multimedia
Secondary Subject Area: [Experience] Multimedia Applications
Relevance To Conference: For image editing, existing methods mainly focus on editing attributes in an image with ambiguous semantics or on transferring regions from a reference image, and fail to achieve specific semantic attribute transfer. In this work, we propose SAT3D, an image-driven semantic attribute transfer method built on pre-trained 3D-aware generative models. SAT3D learns the correlations between attributes and the style code channels of StyleGANv2-based generators, guided by our designed attribute transfer and preservation losses. To the best of our knowledge, SAT3D is the first solution to the novel task of image-driven semantic attribute transfer in 3D, providing high-quality transfer results. Readers in Generative Multimedia and Multimedia Applications are likely to be interested in our article, which addresses an area of interest to multimedia/multimodal processing and a hot topic in computer vision research.
Supplementary Material: zip
Submission Number: 2290