Unsupervised Pre-training for 3D Object Detection with Transformer

PRCV 2022 (modified: 22 Nov 2022)
Abstract: Transformers improve the performance of 3D object detection with few hyperparameters. Inspired by the recent success of pre-trained Transformers in 2D object detection and natural language processing, we propose a pretext task, named random block detection, to pre-train 3DETR in an unsupervised manner (UP3DETR). Specifically, we sample random blocks from the original point clouds and feed them into the Transformer decoder. Then, the whole Transformer is trained by detecting the locations of these blocks. This pretext task can pre-train the Transformer-based 3D object detector without any manual annotations. In our experiments, UP3DETR performs 6.2% better than the 3DETR baseline on the challenging ScanNetV2 dataset and converges faster on object detection tasks.
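
A minimal sketch of the random-block pretext data preparation described in the abstract, assuming the blocks are cubic crops around random seed points and that their axis-aligned bounding boxes serve as the detection targets; the block size, block count, and the helper name `sample_random_blocks` are illustrative assumptions, not details from the paper.

```python
import numpy as np

def sample_random_blocks(points, num_blocks=3, block_size=0.5, rng=None):
    """Crop random cubic blocks from a point cloud and return each block's
    points together with its axis-aligned box (center, size) as a pseudo-label.

    points: (N, 3) array of xyz coordinates.
    """
    rng = np.random.default_rng() if rng is None else rng
    blocks = []
    for _ in range(num_blocks):
        # Pick a random seed point and crop a cube of side `block_size` around it.
        seed = points[rng.integers(len(points))]
        mask = np.all(np.abs(points - seed) <= block_size / 2.0, axis=1)
        block_points = points[mask]
        if len(block_points) == 0:
            continue
        # Pseudo-label: the axis-aligned bounding box of the cropped points.
        mins, maxs = block_points.min(axis=0), block_points.max(axis=0)
        blocks.append((block_points, ((mins + maxs) / 2.0, maxs - mins)))
    return blocks

if __name__ == "__main__":
    cloud = np.random.rand(2048, 3) * 4.0  # stand-in for a ScanNet-like scene
    for block_pts, (center, size) in sample_random_blocks(cloud):
        print(len(block_pts), center.round(2), size.round(2))
```

In an actual pre-training pipeline, the cropped blocks would be fed to the Transformer decoder as queries and the boxes used as regression targets, so no manual annotations are needed; the sketch above only covers the sampling step.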