Object-level Data Augmentation for Visual 3D Object Detection in Autonomous Driving

Published: 01 Jan 2025, Last Modified: 01 Aug 2025, Venue: ICASSP 2025, License: CC BY-SA 4.0
Abstract: Data augmentation plays an important role in vision-based 3D object detection. Existing detectors typically employ image-level or BEV-level data augmentation and cannot exploit more flexible object-level augmentation because of 2D-3D inconsistencies, which limits the diversity of the training data. To alleviate this issue, we propose an object-level data augmentation approach that incorporates scene reconstruction and neural scene rendering. Specifically, we reconstruct the scene and its objects by extracting image features from sequences and aligning them with the associated LiDAR point clouds. This allows editing to be carried out in 3D space, enabling flexible object manipulation. Additionally, we introduce a neural scene renderer that projects the edited 3D scene onto a specified camera plane and renders it as a 2D image. Combined with scene reconstruction, it overcomes the challenges stemming from 2D-3D inconsistencies and enables the generation of object-level augmented images with corresponding labels for model training. To validate the proposed method, we apply it to various multi-camera 3D object detectors and observe consistent performance gains.
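To illustrate why editing in 3D keeps labels consistent, the following is a minimal sketch of the label side only: an object's 3D box is shifted in the camera frame and its 2D box is re-derived by pinhole projection. It does not implement the paper's scene reconstruction or neural renderer. All names (`box_corners`, `project`, `augment_box`), the camera-frame convention (x right, y down, z forward), and the intrinsics are hypothetical, and every corner is assumed to lie in front of the camera (z > 0).

```python
import math

def box_corners(center, size, yaw):
    """Return the 8 corners of a 3D box in an assumed camera frame
    (x right, y down, z forward). size = (length, height, width)."""
    cx, cy, cz = center
    length, height, width = size
    corners = []
    for dx in (-length / 2, length / 2):
        for dy in (-height / 2, height / 2):
            for dz in (-width / 2, width / 2):
                # Rotate the offset about the vertical (y) axis, then translate.
                x = cx + dx * math.cos(yaw) + dz * math.sin(yaw)
                z = cz - dx * math.sin(yaw) + dz * math.cos(yaw)
                corners.append((x, cy + dy, z))
    return corners

def project(point, fx, fy, u0, v0):
    """Pinhole projection of a camera-frame point to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + u0, fy * y / z + v0)

def augment_box(center, size, yaw, shift, fx, fy, u0, v0):
    """Shift the object in 3D and recompute its 2D box from the new 3D box,
    so the 2D and 3D labels stay consistent by construction."""
    new_center = tuple(c + s for c, s in zip(center, shift))
    pixels = [project(p, fx, fy, u0, v0)
              for p in box_corners(new_center, size, yaw)]
    us = [p[0] for p in pixels]
    vs = [p[1] for p in pixels]
    return new_center, (min(us), min(vs), max(us), max(vs))
```

Because the 2D box is always computed from the edited 3D box, the two labels cannot drift apart, which is the property that image-level pasting of objects lacks.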