Click, Crop & Detect: One-Click Offline Annotation for Human-in-the-Loop 3D Object Detection on Point Clouds

Published: 01 Jan 2024, Last Modified: 03 Jul 2025, CVPR Workshops 2024, CC BY-SA 4.0
Abstract: Recent cutting-edge methods for 3D object detection on point clouds are based on supervised learning. Because these methods demand large volumes of high-quality training data, cost-effective annotation plays a crucial role in developing such perception algorithms, e.g., for autonomous vehicles or robots. Every inconsistency or error between the data captured by sensors and the subsequently generated labels can degrade detection performance. Nevertheless, annotation resources are usually very limited in terms of budget and time. We propose a straightforward yet highly effective technique called Click, Crop, and Detect (CCD) to address this issue. The core concept of CCD is to leverage human input to generate a rough prior localization of each object and then to employ 3D object detectors on a simplified, cropped region of interest. We evaluate CCD with popular detectors such as PointPillars, CenterPoint, and TED on nuScenes and KITTI, and show that only marginal changes to existing off-the-shelf detectors are required to make them compatible. Our method consistently outperforms state-of-the-art one-click detectors by 7.89% and 10.45% for cars and pedestrians, respectively, while being much more robust and precise on challenging, sparse inputs. This substantially increases label quality and efficiency when applied to semi-automated ground-truth annotation.
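The abstract describes a click-then-crop step: a human click provides a rough object location, and the detector only sees the points near that click. A minimal sketch of such a cropping step might look as follows (this is an illustrative reconstruction, not the authors' implementation; the function name `crop_roi` and the fixed x-y radius are assumptions):

```python
import numpy as np

def crop_roi(points, click_xyz, radius=4.0):
    """Keep only points within `radius` meters (in the x-y plane) of the
    user's click; the cropped region is what the detector would see."""
    d = np.linalg.norm(points[:, :2] - np.asarray(click_xyz)[:2], axis=1)
    return points[d < radius]

# Toy point cloud: N x 3 array of (x, y, z) coordinates
cloud = np.array([[0.5, 0.2, 0.0],
                  [10.0, 10.0, 0.0],
                  [0.8, -0.3, 1.2]])
roi = crop_roi(cloud, click_xyz=(0.0, 0.0, 0.0), radius=4.0)
print(roi.shape)  # (2, 3) -- the far-away point is discarded
```

In a full pipeline, the cropped subset would then be fed to an off-the-shelf 3D detector (e.g., PointPillars or CenterPoint) whose output box is used as a label proposal for the annotator to verify.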