LOSC: LiDAR Open-voc Segmentation Consolidator

Published: 05 Nov 2025, Last Modified: 30 Jan 2026 · 3DV 2026 Oral · CC BY 4.0
Keywords: LiDAR segmentation, LiDAR semantic segmentation, LiDAR panoptic segmentation, 3D scene understanding, VLM, open-vocabulary, zero-shot
TL;DR: A simple, annotation-free pipeline that achieves SOTA semantic & panoptic LiDAR segmentation on nuScenes and SemanticKITTI, outperforming even image-dependent methods.
Abstract: We study the use of image-based Vision-Language Models (VLMs) for open-vocabulary segmentation of LiDAR scans in driving settings. Classically, image semantics can be back-projected onto 3D point clouds, but the resulting point labels are noisy and sparse. We consolidate these labels to enforce both spatio-temporal consistency and robustness to image-level augmentations, and then train a 3D network on the refined labels. This simple method, called LOSC, outperforms the state of the art in zero-shot open-vocabulary semantic and panoptic segmentation on both nuScenes and SemanticKITTI, by significant margins. Code is available at \url{https://github.com/valeoai/LOSC}.
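The first step of such a pipeline, back-projecting per-pixel VLM labels onto the LiDAR points, can be sketched as below. This is a minimal NumPy illustration under assumed names (`backproject_labels`, `T_cam_from_lidar`, `seg_map`), not the authors' implementation; the consolidation and 3D-network training stages described in the abstract are not shown.

```python
import numpy as np

def backproject_labels(points_lidar, seg_map, T_cam_from_lidar, K, ignore_label=-1):
    """Assign per-pixel semantic labels from an image segmentation map to LiDAR points.

    points_lidar:      (N, 3) xyz coordinates in the LiDAR frame.
    seg_map:           (H, W) integer label map predicted by an image VLM.
    T_cam_from_lidar:  (4, 4) rigid transform from the LiDAR to the camera frame.
    K:                 (3, 3) camera intrinsics.
    Returns an (N,) array of point labels; points that do not project into the
    image keep `ignore_label`.
    """
    n = points_lidar.shape[0]
    labels = np.full(n, ignore_label, dtype=np.int64)

    # Transform points into the camera frame (homogeneous coordinates).
    pts_h = np.hstack([points_lidar, np.ones((n, 1))])        # (N, 4)
    pts_cam = (T_cam_from_lidar @ pts_h.T).T[:, :3]           # (N, 3)

    # Keep only points in front of the camera.
    in_front = pts_cam[:, 2] > 1e-3

    # Pinhole projection to pixel coordinates, guarding against division by ~0.
    uvw = (K @ pts_cam.T).T                                   # (N, 3)
    z = np.clip(uvw[:, 2], 1e-3, None)
    u = np.round(uvw[:, 0] / z).astype(int)
    v = np.round(uvw[:, 1] / z).astype(int)

    # Valid points: in front of the camera and inside the image bounds.
    h, w = seg_map.shape
    valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)

    # Read the label at each projected pixel.
    labels[valid] = seg_map[v[valid], u[valid]]
    return labels
```

In practice such raw back-projected labels are noisy and sparse (occlusions, calibration error, VLM mistakes), which is precisely what the consolidation step in LOSC is designed to address before training the 3D network.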
Supplementary Material: pdf
Submission Number: 86