Keywords: Object Detection, Segmentation, Instance Segmentation, 3D, Open World, Gaussian Splatting, Crime Scene
TL;DR: Open-world vocabulary object detection in 3D Gaussian splats of crime scenes
Abstract: Open-world vocabulary object detection and segmentation in 3D scenes remains an open challenge in 3D computer vision. One application field that can profit from it is crime scene investigation, where digital twins of crime scenes are increasingly common. These scenes often contain a very large number of arbitrary objects. We propose a processing pipeline based on vision-language models that uses DINOv3/SigLIP2 feature extraction to create a latent-space representation of the full contents of a Gaussian splat scene, which can then be queried with an open-world vocabulary to find objects. To reduce computational cost, the system operates in 2D image space and then segments the detected objects, which may be of arbitrary size, within the 3D scene. Our pipeline is designed to work on large, cluttered scenes with many details. Results and their evaluation will be presented at Northern Lights Deep Learning 2026.
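The querying step described in the abstract can be illustrated with a minimal sketch. It assumes that each 2D image region has already been encoded into a feature vector and that the text query has been embedded into the same space; the function names, the toy vectors, and the use of cosine similarity for ranking are illustrative assumptions, not the authors' implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_regions_by_query(region_embeddings, text_embedding, top_k=3):
    """Return (region_index, score) pairs, best match first.

    region_embeddings: hypothetical per-region feature vectors
    text_embedding: hypothetical embedding of the text query
    """
    scores = [
        (index, cosine_similarity(embedding, text_embedding))
        for index, embedding in enumerate(region_embeddings)
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Toy 3-dimensional vectors standing in for real image and text features.
toy_regions = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
]
toy_query = [1.0, 0.1, 0.0]

ranked = rank_regions_by_query(toy_regions, toy_query, top_k=2)
```

In this toy example the first region aligns most closely with the query, followed by the third. In a pipeline like the one described, the top-ranked 2D regions would then be the candidates to segment within the 3D scene.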
Serve As Reviewer: ~Michael_Greza1, ~Florian_Eichinger1
Submission Number: 47