Abstract: Highlights
• We propose to use a latent diffusion model to edit gaze-related features and generate high-quality data with wide coverage.
• We propose a training pipeline for disentanglement of gaze, head posture, and identity information embedded in latent space.
• We design encoder and decoder structures for aligning vector projection and embedding.
• Relative gaze is used to eliminate differences in anatomical structure and achieve gaze-invariant head rotation.
External IDs: dblp:journals/eaai/HuCH25