DiTNet: End-to-End 3D Object Detection and Track ID Assignment in Spatio-Temporal World

Sukai Wang, Peide Cai, Lujia Wang, Ming Liu

IEEE Robotics Autom. Lett. 2021 (modified: 16 Nov 2022)

Abstract: End-to-end 3D object detection and tracking based on point clouds is receiving increasing attention in many robotics applications, such as autonomous driving. Compared with 2D images, 3D point clouds lack the texture information that data association typically relies on. We therefore propose an end-to-end point cloud-based network, DiTNet, that directly assigns a track ID to each object across the whole sequence, without a separate data association step. DiTNet is made location-invariant by using relative locations and embeddings to learn each object's spatial and temporal features in the spatio-temporal world. The features from the detection module help improve tracking performance, and the tracking module, with its final trajectories, in turn helps refine the detection results. We train and evaluate our network in the CARLA simulation environment and on the KITTI dataset. Our approach achieves performance competitive with state-of-the-art methods on the KITTI benchmark.
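
The location-invariance claim can be illustrated with a minimal sketch. This is not DiTNet's actual feature encoding, which the abstract does not specify; it only shows the general reason relative locations help: pairwise offsets between object centers do not change when the whole scene is translated. The names `relative_offsets`, `translate`, and `offsets_match`, and the example coordinates, are hypothetical and chosen for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def relative_offsets(centers: List[Point]) -> List[List[Point]]:
    """Return offsets[i][j] = centers[j] - centers[i] for every object pair."""
    return [
        [(cj[0] - ci[0], cj[1] - ci[1], cj[2] - ci[2]) for cj in centers]
        for ci in centers
    ]


def translate(centers: List[Point], shift: Point) -> List[Point]:
    """Move every object center by the same shift (a change of origin)."""
    return [(c[0] + shift[0], c[1] + shift[1], c[2] + shift[2]) for c in centers]


def offsets_match(a: List[List[Point]], b: List[List[Point]], tol: float = 1e-9) -> bool:
    """Compare two offset tables element-wise within an absolute tolerance."""
    return all(
        math.isclose(x, y, abs_tol=tol)
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
        for x, y in zip(pa, pb)
    )


# Three hypothetical object centers (x, y, z) in one frame.
centers = [(1.0, 2.0, 0.0), (4.0, 6.0, 0.0), (-2.0, 0.5, 1.0)]

# The same scene observed from a different origin.
shifted = translate(centers, (100.0, -50.0, 3.0))

# Absolute coordinates differ, but the relative offsets are identical.
same_relative = offsets_match(relative_offsets(centers), relative_offsets(shifted))
```

A feature built from such offsets depends only on how objects are arranged relative to one another, not on where the scene sits in world coordinates, which is the property the abstract refers to as location invariance.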
