NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

Yuxue Yang; Lue Fan; Ziqi Shi; Junran Peng; Feng Wang; Zhaoxiang Zhang

Published: 10 Jun 2026, Last Modified: 10 Jun 2026. Venue: CVPR 2026 Workshop VideoWorldModel (Poster). Readers: Everyone. License: CC BY 4.0

Keywords: Feed-forward Reconstruction, 4D Gaussian Splatting, Video Generation, Video World Model

TL;DR: NeoVerse is a versatile 4D world model that is capable of 4D reconstruction, novel-trajectory video generation, and rich downstream applications.

Abstract: In this paper, we propose **NeoVerse**, a versatile 4D world model capable of 4D reconstruction, novel-trajectory video generation, and rich downstream applications. We first identify a common scalability limitation in current 4D world modeling methods, caused either by expensive, specialized multi-view 4D data or by cumbersome training pre-processing. In contrast, NeoVerse is built on a core philosophy of making the full pipeline scalable to diverse in-the-wild monocular videos. Specifically, NeoVerse features pose-free feed-forward 4D reconstruction, online monocular degradation pattern simulation, and other well-aligned techniques. These designs give NeoVerse versatility and generalization across various domains. Meanwhile, NeoVerse achieves state-of-the-art performance on standard reconstruction and generation benchmarks.

Email Sharing: We authorize the sharing of all author emails with Program Chairs.

Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.

Submission Number: 8
