Lifelong Learning of Video Diffusion Models From a Single Video Stream

24 Jan 2025 (modified: 18 Jun 2025) · Submitted to ICML 2025 · CC BY 4.0
TL;DR: We show that training autoregressive video diffusion models from a single, continuous video stream can be competitive with offline training given the same number of gradient steps, and we introduce three new datasets.
Abstract: This work demonstrates that training autoregressive video diffusion models from a single, continuous video stream is not only possible but can also be as effective as standard offline training approaches given the same number of gradient steps. We further show that this result can be achieved using experience replay that retains only a subset of the preceding video stream. We also contribute three new single-video generative modeling datasets suitable for evaluating lifelong video model learning: Lifelong Bouncing Balls, Lifelong 3D Maze, and Lifelong PLAICraft. Each dataset contains over a million consecutive frames from an environment of increasing complexity.
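The abstract does not specify how the retained subset of the stream is chosen. A minimal sketch of one plausible mechanism, assuming a fixed-capacity replay buffer maintained by reservoir sampling over incoming clips and mixed with the newest clip at each gradient step, is given below; `ReservoirReplayBuffer`, `capacity`, `replay_per_step`, and the clip-level granularity are all hypothetical choices for illustration, not the authors' stated method.

```python
import random

class ReservoirReplayBuffer:
    """Fixed-capacity buffer retaining a uniform random subset of all
    clips seen so far (reservoir sampling; a hypothetical choice)."""

    def __init__(self, capacity: int, seed: int = 0):
        self.capacity = capacity
        self.buffer = []       # stored clips
        self.num_seen = 0      # total clips observed in the stream
        self.rng = random.Random(seed)

    def add(self, clip) -> None:
        self.num_seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(clip)
        else:
            # Replace a random slot with probability capacity / num_seen,
            # so every clip seen so far is equally likely to be stored.
            j = self.rng.randrange(self.num_seen)
            if j < self.capacity:
                self.buffer[j] = clip

    def sample(self, k: int):
        return self.rng.sample(self.buffer, min(k, len(self.buffer)))


# Hypothetical training loop: mix each new clip from the stream with
# replayed clips before taking one diffusion-model gradient step.
def train_on_stream(stream, model_step, capacity=1024, replay_per_step=3):
    buffer = ReservoirReplayBuffer(capacity)
    for clip in stream:
        batch = [clip] + buffer.sample(replay_per_step)
        model_step(batch)  # one gradient step on the mixed batch
        buffer.add(clip)
```

Reservoir sampling keeps the buffer an unbiased sample of the whole stream with O(1) memory per update, which matches the abstract's claim of retaining only a subset of the preceding stream; other retention policies (e.g., FIFO or recency-weighted sampling) would fit the same interface.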
Primary Area: Deep Learning
Keywords: Lifelong Learning, Continual Learning, Diffusion Models, Video Generation
Submission Number: 15642