FluIDWorld: Fluid-like Interactive Dynamics for 4D Worlds

Published: 02 Mar 2026, Last Modified: 15 Apr 2026
Venue: ICLR 2026 Workshop World Models
License: CC BY 4.0
Keywords: World generation, Dynamic scene, Interactive 4D scene generation
Abstract: Recent advances in generative models have enabled the construction of large-scale, controllable, and realistic 3D scenes and videos. However, these approaches typically produce static scenes or coherent 2D sequences, without maintaining an explicit world state with editable dynamics. In this paper, we propose FluIDWorld, an interactive framework for constructing coherent 4D worlds from a single image, designed to support real-time observation and control of fluid-like dynamics. To achieve stable and controllable dynamics during continuous world expansion, it is crucial to obtain reliable motion estimates for newly revealed regions while maintaining consistency with the existing world state. To this end, FluIDWorld incrementally estimates view-grounded motion, aligns each estimate into a consistent global frame via fast geometric alignment, and updates a compact Eulerian velocity field to preserve temporal coherence. This design enables memory-efficient and scalable 4D world generation with low latency, allowing a static scene to be expanded into a temporally coherent 4D world in just 8 seconds on a single GPU, while supporting intuitive motion editing and real-time user feedback.
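The abstract's three-step loop (estimate view-grounded motion, align it into a global frame, update an Eulerian velocity field) can be illustrated with a minimal sketch. The abstract does not specify the alignment method or the update rule, so everything here is an assumption made for illustration: the function name `update_velocity_grid`, the camera-to-world pose `(R, t)`, the dense voxel grid layout, and the `blend` factor are all hypothetical, and a simple rigid transform plus running blend stands in for whatever FluIDWorld actually does.

```python
import numpy as np

def update_velocity_grid(grid, weight, pts_cam, vel_cam, R, t, origin, cell, blend=0.5):
    """Fuse camera-frame motion samples into a world-frame Eulerian velocity grid.

    grid:   (X, Y, Z, 3) float array, one velocity vector per cell
    weight: (X, Y, Z) float array, number of samples seen per cell
    pts_cam, vel_cam: (N, 3) sample positions and velocities in the camera frame
    R, t:   assumed camera-to-world rotation (3, 3) and translation (3,)
    origin, cell: world position of the grid corner and the cell edge length
    """
    # Positions are rotated and translated; velocities are only rotated.
    pts_world = pts_cam @ R.T + t
    vel_world = vel_cam @ R.T

    # Map world positions to integer cell indices and drop out-of-grid samples.
    idx = np.floor((pts_world - origin) / cell).astype(int)
    inside = np.all((idx >= 0) & (idx < np.array(grid.shape[:3])), axis=1)
    idx, vel_world = idx[inside], vel_world[inside]
    cells = (idx[:, 0], idx[:, 1], idx[:, 2])

    # Average all samples that land in the same cell.
    acc = np.zeros_like(grid)
    cnt = np.zeros(grid.shape[:3])
    np.add.at(acc, cells, vel_world)
    np.add.at(cnt, cells, 1.0)
    hit = cnt > 0
    mean = np.zeros_like(grid)
    mean[hit] = acc[hit] / cnt[hit][:, None]

    # Newly revealed cells take the estimate directly; already-known cells
    # are blended so the existing world state changes smoothly.
    new = hit & (weight == 0)
    old = hit & (weight > 0)
    grid[new] = mean[new]
    grid[old] = (1 - blend) * grid[old] + blend * mean[old]
    weight[hit] += cnt[hit]
    return grid, weight
```

In this sketch memory depends on the grid resolution rather than on the number of views or frames processed, which is one plausible reading of why a compact Eulerian field keeps the world state small as it expands.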
Submission Number: 54