Improved latent diffusion-based IC-DGAN framework for high-resolution multi-feature and expression manipulation

Published: 01 Jan 2026 · Last Modified: 14 Nov 2025 · Neural Networks 2026 · CC BY-SA 4.0
Abstract: Facial expression and multi-feature manipulation play a vital role in applications such as media entertainment and biometric forensics. However, existing approaches face significant challenges, including semantic inconsistency, sensitivity to pose and illumination variations, and high computational demands. To address these challenges, this study proposes an improved latent diffusion-based deep generative adversarial network (IC-DGAN) framework that integrates multiple generators and discriminators, K-means clustering, and constructive pre-training to achieve precise semantic multi-feature and facial expression manipulation. The framework leverages scale-invariant feature transform (SIFT) and latent diffusion models to autonomously disentangle and manipulate facial attributes, enabling synchronized decomposition across multiple levels and generating high-resolution, realistic portraits. By mapping facial portraits back to the latent space, IC-DGAN enables robust attribute editing (including age, gender, and expression) while minimizing visual distortions. Comprehensive evaluations on benchmark datasets, including CelebA-HQ, CAS-PEAL, and RafD, demonstrate that IC-DGAN outperforms state-of-the-art methods, reducing unintended portrait variations by 12.3%, enhancing manipulation accuracy by 8.7%, and achieving a Fréchet Inception Distance (FID) of 25.94, significantly surpassing existing benchmarks. These results underscore the framework's potential for advancing high-fidelity facial editing, offering a robust solution to longstanding challenges.
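The abstract describes editing attributes by mapping portraits into a latent space and manipulating them there, but does not include code. As a minimal sketch of that generic recipe (estimate an attribute direction in latent space, then shift a code along it), the snippet below uses a simple difference-of-means linear probe; `attribute_direction`, `edit_attribute`, the latent dimension, and the toy data are all illustrative assumptions, not IC-DGAN's actual components or API.

```python
import numpy as np

# Illustrative latent-space attribute editing: encode -> shift -> decode is the
# generic pipeline; only the "shift" step is sketched here. Shapes and the
# difference-of-means direction estimate are assumptions for demonstration.

LATENT_DIM = 512
rng = np.random.default_rng(42)

def attribute_direction(latents: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Estimate a unit editing direction as the difference of class means
    (a simple linear probe; a real framework would learn disentangled axes)."""
    direction = latents[labels == 1].mean(axis=0) - latents[labels == 0].mean(axis=0)
    return direction / np.linalg.norm(direction)

def edit_attribute(w: np.ndarray, direction: np.ndarray, strength: float) -> np.ndarray:
    """Shift one latent code along the attribute direction; |strength| sets
    how pronounced the edit is, and its sign sets the polarity."""
    return w + strength * direction

# Toy stand-ins for encoder outputs: 1000 latent codes, each with a binary
# attribute label (e.g. smiling / not smiling).
latents = rng.standard_normal((1000, LATENT_DIM))
labels = rng.integers(0, 2, size=1000)

direction = attribute_direction(latents, labels)
w_edited = edit_attribute(latents[0], direction, strength=2.0)
print("edit magnitude:", np.linalg.norm(w_edited - latents[0]))  # equals strength
```

In practice, the edited code would then be decoded by the generator back into an image; keeping the shift small and the direction well-disentangled is what limits the "unintended portrait variations" the abstract measures.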