Speech driven video editing via an audio-conditioned diffusion model

Published: 01 Jan 2024, Last Modified: 13 Nov 2024 · Image Vis. Comput. 2024 · CC BY-SA 4.0
Abstract: Highlights
- Denoising diffusion models for speech-driven video editing.
- We present a speech-conditioned diffusion model for this task.
- We demonstrate promising results on the GRID and CREMA-D datasets.
- An unstructured diffusion-based approach can generate high-quality image frames without complex loss functions.
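The last highlight refers to the standard denoising diffusion training objective: a plain mean-squared error between the true and predicted noise, with the denoiser additionally conditioned on speech features. A minimal NumPy sketch of that idea is below; the shapes, the toy linear "denoiser", and the variable names are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Standard DDPM noise schedule (assumed linear, 100 steps for illustration).
T = 100
betas = np.linspace(1e-4, 0.02, T)
alpha_bars = np.cumprod(1.0 - betas)

def q_sample(x0, t, noise):
    """Forward diffusion: corrupt a clean frame x0 to timestep t."""
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise

def denoiser(x_t, t, audio_emb, W):
    """Toy stand-in for the network: predicts the noise from the noisy
    frame, the timestep, and the audio conditioning vector."""
    inp = np.concatenate([x_t, [t / T], audio_emb])
    return W @ inp

d_frame, d_audio = 8, 4
x0 = rng.standard_normal(d_frame)          # a clean video frame (flattened)
audio_emb = rng.standard_normal(d_audio)   # speech features for this frame
W = 0.1 * rng.standard_normal((d_frame, d_frame + 1 + d_audio))

t = 50
noise = rng.standard_normal(d_frame)
x_t = q_sample(x0, t, noise)

# Training objective: simple MSE between true and predicted noise --
# no adversarial or hand-tuned perceptual loss terms required.
eps_pred = denoiser(x_t, t, audio_emb, W)
loss = float(np.mean((eps_pred - noise) ** 2))
print(loss)
```

In a real model the linear map would be a U-Net (or similar) with the audio embedding injected via cross-attention or feature concatenation, but the loss stays this simple.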