DTSSD: Dual-Channel Transformer-Based Network for Point-Based 3D Object Detection

Published: 2023, Last Modified: 05 Mar 2025 | IEEE Signal Process. Lett. 2023 | CC BY-SA 4.0
Abstract: In 3D object detection, previous methods mainly use a single-channel feature encoding network to extract point-wise features. Although this is effective, we find that relying on a single-channel encoding network is insufficient and limits detection performance. To this end, we propose a dual-channel transformer-based feature encoding network whose backbone integrates both set abstraction layers and transformer blocks. This enables the model to exploit both fine-grained and long-range contextual information about objects, so that the two components complement each other. In addition, a centroid estimation module is introduced to obtain a powerful representation of the whole object. Finally, because point density is crucial for detection performance, we propose a central density-aware enhancement module that equips center features with distinct density features. Experimental results on the KITTI dataset demonstrate the effectiveness of the proposed method.