BERT-Enhanced with Context-Aware Embedding for Instance Segmentation in 3D Point Clouds

IGARSS 2022 (modified: 27 Oct 2022)
Abstract: Inspired by the successful application of transformer networks in Natural Language Processing (NLP), we propose a novel point cloud segmentation method based on Bidirectional Encoder Representations from Transformers (BERT). Specifically, the whole point cloud is scanned by multiple overlapping windows. To our knowledge, this is the first attempt to feed each window's point cloud into a BERT model, which outputs the points' semantic labels together with high-dimensional, context-aware point embeddings. During training, a Kullback-Leibler (KL)-divergence-based clustering loss optimizes the network's parameters by comparing similarity matrices computed from the point embeddings and from the predicted semantic labels. The final instance labels are obtained by applying a softmax function to the optimized point embeddings. Evaluated on the Stanford 3D Indoor Scene (S3DIS) dataset, the proposed method reaches a micro-mean accuracy (mAcc) of 87.3% on the semantic segmentation task and a mean Average Precision (mAP) on the instance segmentation task, surpassing traditional point cloud segmentation models on both tasks.
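The abstract does not give the exact form of the KL-divergence clustering loss, so the following is only a minimal NumPy sketch of one plausible reading: pairwise similarity matrices are built from the point embeddings and from the predicted semantic label probabilities, each row is normalized into a distribution with a softmax, and the row-wise KL divergence between the two matrices is averaged. All function names and the dot-product similarity choice are illustrative assumptions, not the paper's definitive formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def kl_clustering_loss(embeddings, label_probs, eps=1e-8):
    """Sketch of a KL-divergence clustering loss (assumed form).

    embeddings:  (N, D) context-aware point embeddings
    label_probs: (N, C) predicted semantic label probabilities
    """
    # Pairwise similarity of embeddings, row-normalized to a distribution
    emb_sim = softmax(embeddings @ embeddings.T, axis=-1)
    # Pairwise similarity of the semantic predictions, likewise normalized
    lab_sim = softmax(label_probs @ label_probs.T, axis=-1)
    # Row-wise KL(lab_sim || emb_sim), averaged over all points
    kl = (lab_sim * (np.log(lab_sim + eps) - np.log(emb_sim + eps))).sum(axis=-1)
    return kl.mean()

def instance_labels(embeddings):
    """Assign each point an instance label via softmax over its embedding,
    mirroring the abstract's softmax-on-embeddings step (dimensions assumed
    to index candidate instances)."""
    return softmax(embeddings, axis=-1).argmax(axis=-1)
```

Minimizing such a loss pulls the embedding-space similarity structure toward the semantic-label similarity structure, so points predicted to share a class end up clustered in embedding space, which is what makes the final softmax/argmax step a reasonable instance assignment.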
