Keywords: Face Video Editing, Face Image Editing, Precision Guidance
Abstract: Preserving identity while precisely manipulating attributes is a central challenge
in face editing for both images and videos. Existing methods often introduce visual artifacts or fail to maintain temporal consistency. We present **FlowGuide**,
a unified framework that achieves fine-grained control over face editing in diffusion models. Our approach is founded on the local linearity of the UNet bottleneck’s latent space, which allows us to treat semantic attributes as corresponding
to specific linear subspaces, providing a mathematically sound basis for disentanglement. FlowGuide first identifies a set of orthogonal basis vectors that span
these semantic subspaces for both the original content and the target edit, a representation that efficiently captures the most salient features of each. We then
introduce a novel guidance mechanism that quantifies the geometric alignment
between these bases to dynamically steer the denoising trajectory at each step.
This approach offers superior control by ensuring edits are confined to the desired
attribute’s semantic axis while preserving orthogonal components related to identity. Extensive experiments demonstrate that FlowGuide achieves state-of-the-art
performance, producing high-quality edits with superior identity preservation and
temporal coherence. Our code is available at: https://github.com/yl4467/flow_edit.
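The abstract's core mechanism can be sketched in a few lines: extract orthonormal bases for the source and target semantic subspaces from bottleneck activations, measure their geometric alignment via principal angles, and apply an edit that is confined to the target attribute's subspace while removing any component lying in the identity subspace. This is a minimal illustrative sketch only, not the authors' implementation; the function names, the SVD-based basis extraction, and the mean-cosine alignment score are all assumptions made for illustration.

```python
import numpy as np

def orthonormal_basis(features, k):
    # features: (n_samples, d) activations, e.g. from a UNet bottleneck.
    # The top-k right singular vectors of the centered features span
    # a k-dimensional semantic subspace (assumed construction).
    _, _, vt = np.linalg.svd(features - features.mean(0), full_matrices=False)
    return vt[:k].T  # (d, k), columns are orthonormal

def subspace_alignment(basis_a, basis_b):
    # Singular values of A^T B are the cosines of the principal angles
    # between the two subspaces; their mean is one simple alignment score.
    cosines = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    return float(cosines.mean())  # in [0, 1]

def guided_step(latent, edit_dir, basis_attr, basis_id, scale=1.0):
    # Project the raw edit direction onto the target attribute's subspace,
    # then remove any component falling in the identity subspace, so the
    # update leaves identity-related directions untouched.
    e = basis_attr @ (basis_attr.T @ edit_dir)  # confine to attribute axis
    e = e - basis_id @ (basis_id.T @ e)         # preserve identity components
    return latent + scale * e
```

Because `basis_id` has orthonormal columns, the projected-out edit satisfies `basis_id.T @ (guided - latent) = 0` exactly, which is the sense in which orthogonal identity components are preserved in this sketch.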
Primary Area: generative models
Submission Number: 3400