3D Gaussian Representations in Semantic Occupancy Prediction: A Comprehensive Survey and Analysis

Published: 22 Oct 2025, Last Modified: 12 Nov 2025, OpenReview Archive Direct Upload, Everyone, CC BY 4.0
Abstract: Gaussian splatting and semantic occupancy prediction for autonomous perception share the common goal of achieving accurate 3D scene understanding. Emerging within a similar time frame, the use of 3D Gaussians in occupancy models has gained considerable attention. These models have demonstrated benefits such as reduced latency and a more memory-efficient scene representation compared to dense voxels, with results comparable to previous state-of-the-art approaches. This holds despite the differences between continuous and discrete scene representations when compared with voxel-based methods. This growing area of research builds upon work in Bird’s Eye View perception and voxel-driven semantic occupancy prediction, both of which have been the subject of major surveys that outline their development to the present day. In this survey, Gaussian-driven semantic occupancy prediction models are analysed in detail, with particular emphasis on their contributions to the field, design choices, multi-modal research directions, and supervision strategies. In addition, a detailed results table for these models across three notable datasets is compiled to better understand the influence of specific design decisions. We also investigate the distribution of Gaussian parameters at run time to determine whether redundancy exists within these models. Finally, the limitations and future directions of this line of research are explored, providing a clearer view of the strengths and weaknesses of 3D Gaussians in occupancy prediction. In particular, we conclude that the use of Gaussians in occupancy prediction models is positioned to complement voxel-driven methods, namely through Gaussian-based rendering loss for the enforcement of temporal and view consistency across the surround-view cameras.