Keywords: video diffusion model, video frame interpolation, event camera
Abstract: Latent Diffusion Models have advanced video frame interpolation by generating intermediate frames between input frames. However, effectively handling large temporal gaps and complex motion remains challenging and often leads to artifacts. We argue that event camera signals, which capture continuous motion at high temporal resolution, are ideal for bridging these temporal gaps and enhancing interpolation precision. Given the impracticality of training an event-assisted model from scratch, we introduce a novel adapter-based framework that seamlessly integrates high-temporal-resolution cues from event cameras into pre-trained image-to-video models without modifying their underlying structure. Our method leverages Image Warped Events (IWEs) and bidirectional sparse optical flow for precise spatial and temporal alignment, significantly reducing artifacts and improving interpolation quality. Experimental results demonstrate that our event-enhanced interpolation achieves superior accuracy and temporal coherence compared to existing state-of-the-art methods.
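The Image Warped Events (IWEs) named in the abstract follow the standard event-camera construction: each event is transported along an estimated optical flow to a common reference time and accumulated into an image. Below is a minimal sketch of that construction, assuming a dense per-pixel flow field and nearest-neighbor accumulation; the function name and interface are illustrative and not the submission's actual implementation, which uses bidirectional sparse flow.

```python
import numpy as np

def image_of_warped_events(xs, ys, ts, flow, t_ref, height, width):
    """Accumulate events warped to a reference time into a single image.

    xs, ys : int arrays of event pixel coordinates
    ts     : float array of event timestamps
    flow   : (H, W, 2) per-pixel optical flow in pixels per unit time
             (an assumption for this sketch; the paper uses sparse flow)
    t_ref  : reference timestamp the events are warped to
    """
    dt = t_ref - ts                          # time remaining to the reference
    fx = flow[ys, xs, 0]                     # flow sampled at each event's pixel
    fy = flow[ys, xs, 1]
    # Transport each event along its flow vector, then snap to the grid.
    wx = np.clip(np.round(xs + fx * dt).astype(int), 0, width - 1)
    wy = np.clip(np.round(ys + fy * dt).astype(int), 0, height - 1)
    # Nearest-neighbor splatting of event counts (bilinear splatting is
    # common in practice but omitted here for brevity).
    iwe = np.zeros((height, width), dtype=np.float32)
    np.add.at(iwe, (wy, wx), 1.0)
    return iwe
```

Under an accurate flow, events generated by the same scene edge collapse onto the same pixels, giving a sharp, motion-compensated image that is spatially aligned with the target intermediate frame; this alignment is what makes IWEs a natural conditioning signal for the frozen image-to-video backbone.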
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 6704