Abstract: Gaussian splatting and semantic occupancy prediction
for autonomous perception share the common goal of
achieving accurate 3D scene understanding. The two fields emerged
within a similar time frame, and the use of 3D Gaussians in occupancy
models has since gained considerable attention. These models have
demonstrated benefits such as reduced latency and a more
memory-efficient scene representation compared to dense voxels,
with results comparable to previous state-of-the-art approaches.
This holds even though Gaussians form a continuous scene
representation, whereas voxel-based methods rely on a discrete
one. This growing area of research builds upon
work in Bird’s Eye View perception and voxel-driven semantic
occupancy prediction, both of which have been the subject of
major surveys that outline their development to the present day.
In this survey, Gaussian-driven semantic occupancy prediction
models are analysed in detail, with particular emphasis on their
contributions to the field, design choices, multi-modal research
directions, and supervision strategies. In addition, a detailed
results table for these models across three notable datasets is
compiled to better understand the influence of specific design
decisions. We also investigate the distribution of Gaussian
parameters at run time to determine whether redundancy
exists within these models. Finally, the limitations and future
directions of this line of research are explored, providing a
clearer view of the strengths and weaknesses of 3D Gaussians
in occupancy prediction. In particular, we conclude that Gaussians
in occupancy prediction models are positioned to complement
voxel-driven methods, namely through a Gaussian-based rendering
loss that enforces temporal and view consistency across the
surround-view cameras.