OccVAR: Scalable 4D Occupancy Prediction via Next-Scale Prediction

19 Sept 2024 (modified: 15 Nov 2024), ICLR 2025 Conference Withdrawn Submission, CC BY 4.0
Keywords: Autonomous driving, World model, 3D generation
TL;DR: OCCVAR: Scalable 4D Occupancy Prediction via Next-Scale Prediction
Abstract: In this paper, we propose OCCVAR, a generative occupancy world model that simulates the movement of the ego vehicle and the evolution of the surrounding environment. Unlike visual generation, an occupancy world model must capture both the fine-grained 3D geometry and the dynamic evolution of 3D scenes, which poses great challenges for generative models. Recent autoregressive (AR) approaches have demonstrated the potential to predict vehicle movement and future occupancy scenes simultaneously from historical observations, but they typically suffer from inefficiency and temporal degradation in long-term generation. To address both efficiency and quality, we propose a spatial-temporal transformer based on temporal next-scale prediction, which predicts 4D occupancy scenes from coarse to fine scales. To model the dynamic evolution of the scene, we place the ego movement before the tokenized occupancy sequence, enabling both ego-movement prediction and controllable scene generation. To model the fine-grained 3D geometry, OCCVAR uses a multi-scale scene tokenizer to capture the hierarchical information of the 3D scene. Experiments show that OCCVAR achieves high-quality occupancy reconstruction, long-term generation, and faster inference than prior works.
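The token layout described in the abstract (an ego-movement prefix followed by occupancy tokens ordered from coarse to fine) can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single frame, square 2D token maps, and the block-causal attention pattern commonly used for next-scale prediction, in which every token may attend to its own scale and to all earlier blocks. The function names, the scale schedule (1, 2, 4), and the single ego token are hypothetical choices made for illustration only.

```python
from itertools import accumulate

def block_sizes(scales, num_ego_tokens=1):
    """Token count per block: ego prefix first, then one s*s map per scale.

    Assumption: square 2D token maps; the paper's actual token shapes
    are not given in the abstract.
    """
    return [num_ego_tokens] + [s * s for s in scales]

def block_causal_mask(sizes):
    """mask[i][j] is True when query token i may attend to key token j.

    A token sees every token in its own block and in all earlier blocks,
    so a whole scale can be predicted in one parallel step.
    """
    # block_id[t] = index of the block that token t belongs to
    block_id = [b for b, n in enumerate(sizes) for _ in range(n)]
    return [[bj <= bi for bj in block_id] for bi in block_id]

# Hypothetical schedule: 1 ego token, then 1x1, 2x2, and 4x4 token maps.
sizes = block_sizes(scales=(1, 2, 4))
mask = block_causal_mask(sizes)
# Index of the first token of each block in the flattened sequence.
starts = [0] + list(accumulate(sizes))[:-1]
```

Under these assumptions the sequence has 22 tokens, and generation takes one step per scale rather than one step per token, which is the source of the efficiency gain that next-scale prediction aims for over token-by-token autoregression.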
Supplementary Material: zip
Primary Area: applications to robotics, autonomy, planning
Submission Number: 1836