Keywords: portrait animation, diffusion model
Abstract: In this paper, we present ***FaceShot***, a novel training-free portrait animation framework designed to bring any character to life from any driven video, without fine-tuning or retraining.
We achieve this by generating precise and robust reposed landmark sequences with an appearance-guided landmark matching module and a coordinate-based landmark retargeting module.
Together, these components harness the robust semantic correspondences of latent diffusion models to produce facial motion sequences across a wide range of character types.
The landmark sequences are then fed into a pre-trained landmark-driven animation model to generate the animated video.
With this powerful generalization capability, FaceShot significantly broadens the scope of portrait animation: it removes the reliance on realistic-portrait landmark detection, enabling animation of any stylized character from any driven video.
Moreover, FaceShot is compatible with any landmark-driven animation model, significantly improving its overall performance.
Extensive experiments on our newly constructed character benchmark CharacBench confirm that FaceShot consistently surpasses state-of-the-art (SOTA) approaches across any character domain.
More results are available at our project website https://faceshot2024.github.io/faceshot/.
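To make the pipeline described in the abstract concrete, the snippet below is a minimal, runnable sketch of the match-retarget-animate flow. It is not the authors' code: the simple displacement-and-scale retargeting rule, the 68-point landmark format, and all function names (`retarget_landmarks`, `animate`, the `animator` callable) are illustrative assumptions standing in for the appearance-guided matching, coordinate-based retargeting, and landmark-driven animation modules.

```python
import numpy as np

# Hypothetical sketch of the FaceShot flow; landmarks are (N, 2) arrays of (x, y) points.

def retarget_landmarks(ref, drv_first, drv_frame):
    """Assumed coordinate-based retargeting: apply the driving frame's
    per-point displacement (relative to the first driving frame) to the
    reference landmarks, rescaled to the reference face's bounding box."""
    scale = (ref.max(0) - ref.min(0)) / (drv_first.max(0) - drv_first.min(0) + 1e-8)
    return ref + (drv_frame - drv_first) * scale

def animate(ref_landmarks, driving_landmark_seq, animator):
    """Build the reposed landmark sequence, then hand it to a pre-trained
    landmark-driven animation model (`animator` is a placeholder callable)."""
    reposed = [
        retarget_landmarks(ref_landmarks, driving_landmark_seq[0], frame)
        for frame in driving_landmark_seq
    ]
    return animator(reposed)

if __name__ == "__main__":
    ref = np.random.rand(68, 2) * 256                         # landmarks matched on the character
    drv = [np.random.rand(68, 2) * 256 for _ in range(16)]    # landmarks from the driving video
    out = animate(ref, drv, animator=lambda seq: len(seq))    # stand-in "renderer" returns frame count
    print(out)
```

In the actual framework the reference landmarks would come from the appearance-guided matching module (diffusion-feature correspondences) rather than being given, and `animator` would be any pre-trained landmark-driven animation model.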
Supplementary Material: zip
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 387