LinkOcc: 3D Semantic Occupancy Prediction With Temporal Association

Published: 01 Jan 2025, Last Modified: 13 May 2025. IEEE Trans. Circuits Syst. Video Technol. 2025. CC BY-SA 4.0
Abstract: 3D semantic occupancy has garnered considerable attention because it provides abundant structural information covering the entire autonomous driving scene. However, existing 3D occupancy prediction methods are typically tailored for single-frame inputs, resulting in unsatisfactory performance and temporal inconsistencies in real-world continuous scenarios. In this paper, we introduce LinkOcc, a sparse-query approach incorporating an efficient temporal association mechanism for 3D semantic occupancy prediction. LinkOcc is conceptually built on the prevalent DETR-like framework for 2D segmentation, and we construct the temporal association mechanism on top of it. Specifically, we propose a near-online training strategy that jointly trains on two adjacent frames, combining the benefits of both online and offline methods. Moreover, we introduce a temporal association strategy that uses contrastive learning to learn discriminative features for cross-frame semantic-level association. Comprehensive experiments demonstrate that LinkOcc not only surpasses state-of-the-art methods in 3D occupancy prediction, but also achieves promising performance on foreground classes.
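To make the idea of cross-frame contrastive association concrete, the following is a minimal sketch and not the formulation used in LinkOcc, which the abstract does not specify. It assumes that each frame yields a set of query embeddings with semantic class labels, and it applies a generic supervised InfoNCE-style loss that pulls same-class queries from two adjacent frames together and pushes different-class queries apart. The function name `cross_frame_contrastive_loss`, its parameters, and the temperature value are hypothetical choices made for illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a non-zero vector to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def cross_frame_contrastive_loss(queries_t, labels_t,
                                 queries_prev, labels_prev,
                                 temperature=0.1):
    """Generic supervised InfoNCE-style loss across two adjacent frames.

    Hypothetical illustration only: a query in the current frame treats
    previous-frame queries with the same semantic label as positives and
    all other previous-frame queries as negatives.
    """
    q_t = [normalize(q) for q in queries_t]
    q_p = [normalize(q) for q in queries_prev]
    losses = []
    for q, label in zip(q_t, labels_t):
        # Cosine similarity to every previous-frame query, scaled by temperature.
        logits = [dot(q, k) / temperature for k in q_p]
        positives = [i for i, l in enumerate(labels_prev) if l == label]
        if not positives:
            continue  # class absent from the previous frame: no positive pair
        # Numerically stable log-sum-exp over all previous-frame queries.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        # Average negative log-likelihood over the positives.
        nll = -sum(logits[i] - log_denom for i in positives) / len(positives)
        losses.append(nll)
    return sum(losses) / len(losses) if losses else 0.0


# Toy usage: two queries per frame, one per class.
prev_queries = [[1.0, 0.0], [0.0, 1.0]]
labels = ["car", "road"]
aligned = cross_frame_contrastive_loss(
    [[0.9, 0.1], [0.1, 0.9]], labels, prev_queries, labels)
swapped = cross_frame_contrastive_loss(
    [[0.1, 0.9], [0.9, 0.1]], labels, prev_queries, labels)
```

In this toy setup the loss is lower when same-class embeddings stay close across frames (`aligned`) than when they drift toward the other class (`swapped`), which is the behavior a contrastive association objective is meant to encourage.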