Abstract: This paper presents InvLatents, a novel framework for character animation that leverages latent-inversion diffusion models to ensure consistent identity preservation across frames. Existing diffusion-based character animation methods often struggle to maintain identity consistency due to the inherent randomness of the generation process. To address this issue, InvLatents introduces a latent inversion technique that incorporates target identity and pose guidance into the inference stage. By applying different injection ratios to different branches, the method extracts richer identity information from the reference image. Additionally, a lightweight pose integration module compensates for potentially missing pose guidance. Experimental results on the TikTok dataset demonstrate that InvLatents achieves competitive performance compared to state-of-the-art approaches, effectively maintaining both identity and pose consistency without requiring additional training. The proposed method can be integrated as a plugin into other diffusion models, offering a promising solution for generating temporally coherent motion videos with consistent identity. Project page: https://github.com/SodaLee/InvLatents.
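
The abstract describes injecting inverted (identity-carrying) latents at different ratios in different branches during inference. The paper's exact mechanism is not given here, but the general idea of ratio-controlled latent injection can be sketched as a convex blend of an inverted latent with fresh noise; all function names, shapes, and the specific ratios below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def blend_latents(inverted, noise, injection_ratio):
    """Blend an inverted latent with random noise.

    injection_ratio = 1.0 keeps only the inverted latent (maximum
    identity information from the reference image); 0.0 keeps only
    fresh noise. Per-branch ratios let different branches receive
    different amounts of identity guidance.
    """
    return injection_ratio * inverted + (1.0 - injection_ratio) * noise

rng = np.random.default_rng(0)
inverted = rng.standard_normal((4, 8, 8))  # stand-in for a DDIM-inverted latent
noise = rng.standard_normal((4, 8, 8))     # fresh Gaussian noise

# Hypothetical example: stronger injection in an identity branch
# than in a pose branch.
identity_branch_latent = blend_latents(inverted, noise, 0.8)
pose_branch_latent = blend_latents(inverted, noise, 0.3)
```

Because the blend operates purely on latents at inference time, a scheme of this kind needs no additional training, which is consistent with the plugin-style integration the abstract claims.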
External IDs: dblp:journals/vc/LiTWCL25