MakeupAnyone: Self-Supervised Identity-Preserving MakeUp Transfer with Region-Aware Multi-Scale Alignment
Keywords: Makeup Transfer, Self-Supervised Learning, Diffusion Models
TL;DR: MakeupAnyone is a diffusion-based makeup transfer framework built on self-supervised data augmentation and region-aware multi-scale alignment.
Abstract: Existing makeup transfer methods often fail in real-world scenarios: the scarcity of high-quality paired data leads to overfitting and unstable style reproduction, while weak decoupling of identity from style causes facial distortion and degraded identity consistency.
To address these challenges, we propose MakeupAnyone, which achieves fine-grained, high-fidelity makeup transfer through self-supervised data augmentation and region-aware multi-scale alignment.
To overcome the lack of paired data, we introduce a self-supervised pipeline that leverages the strong priors of large Vision-Language Models (VLMs) and instruction-guided image editing models for data augmentation. The augmented data is then filtered by facial structure consistency, aesthetic quality, and image-text consistency, yielding pseudo-makeup pairs of high quality and diversity.
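As a rough illustration of this filtering stage, the sketch below applies the three quality gates to candidate pairs. The scorer callables (`structure_score`, `aesthetic_score`, `clip_score`), the pair format, and the threshold values are hypothetical stand-ins, not the models or settings used in the paper.

```python
from typing import Any, Callable, List, Tuple

Image = Any  # stand-in type, e.g. a PIL image or a tensor

def filter_pseudo_pairs(
    pairs: List[Tuple[Image, Image, str]],             # (source, edited, edit instruction)
    structure_score: Callable[[Image, Image], float],  # facial-structure consistency
    aesthetic_score: Callable[[Image], float],         # aesthetic quality of the edit
    clip_score: Callable[[Image, str], float],         # image-text consistency
    thresholds: Tuple[float, float, float] = (0.8, 5.0, 0.25),  # illustrative values
) -> List[Tuple[Image, Image, str]]:
    """Keep only pseudo-makeup pairs that pass all three quality gates."""
    t_struct, t_aes, t_clip = thresholds
    kept = []
    for src, edited, instruction in pairs:
        if (structure_score(src, edited) >= t_struct       # face geometry preserved
                and aesthetic_score(edited) >= t_aes       # edit looks plausible
                and clip_score(edited, instruction) >= t_clip):  # edit matches the text
            kept.append((src, edited, instruction))
    return kept
```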
Furthermore, we propose a Region-Aware Multi-Scale Alignment approach for makeup feature extraction and training. Specifically, two distinct makeup encoders capture multi-scale global semantic features and local regional style features, respectively; an adaptive fusion module then combines them. Training is guided by a composite loss that explicitly balances global style fidelity, local detail accuracy, and identity consistency across facial components.
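The sketch below shows one plausible reading of the adaptive fusion module, as a gated blend of the two encoders' features, and of the composite loss as a weighted sum. The gating design, layer sizes, and loss weights are assumptions for illustration, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    """Gated fusion of global semantic and local regional makeup features.
    The sigmoid-gate design and dimensions are illustrative assumptions."""
    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, f_global: torch.Tensor, f_local: torch.Tensor) -> torch.Tensor:
        # Per-channel gate decides how much each stream contributes.
        g = self.gate(torch.cat([f_global, f_local], dim=-1))
        return g * f_global + (1.0 - g) * f_local

def composite_loss(
    l_global: torch.Tensor,   # global style fidelity term
    l_local: torch.Tensor,    # local detail accuracy term
    l_id: torch.Tensor,       # identity consistency term
    w: Tuple[float, float, float] = (1.0, 1.0, 0.5),  # hypothetical weights
) -> torch.Tensor:
    """Weighted sum balancing the three training objectives."""
    return w[0] * l_global + w[1] * l_local + w[2] * l_id

from typing import Tuple  # noqa: E402 (kept adjacent for a compact sketch)
```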
Extensive experiments on the Makeup Transfer and Makeup-Wild datasets, as well as our newly curated dataset, demonstrate that MakeupAnyone achieves state-of-the-art performance with improved detail fidelity and identity similarity.
Primary Area: generative models
Submission Number: 7935