Abstract: 3D object detection from a Bird’s Eye View (BEV) has emerged as a prominent perception paradigm for autonomous driving. Most current 3D object detection methods still rely on conventional Cartesian coordinates, which do not align with the ray-based geometry inherent in camera imaging. Polar coordinates, in contrast, naturally match the geometry of camera perception. However, transforming between the two coordinate systems distorts the perceptual information, leading to issues we term “Weak Adaptability to Heatmap Distribution” and “Offset in the Center Point of the Bounding Box.” To address these challenges, this paper proposes a 3D object detection model named PolarBEVU, which builds the bird’s-eye view in Polar coordinates via multi-camera unprojection. The model introduces a “Deformable Uniform Heatmap Distribution” method that adapts heatmap computation to box shapes, generating high-quality heatmaps and resolving the weak-adaptability issue. It further incorporates a “Dynamic High-risk Regression Region” to improve the accuracy and robustness of bounding-box center regression, mitigating the center-point offset issue. In extensive experiments on the nuScenes dataset, PolarBEVU achieves 49.9% mAP and 57.4% NDS on the test set, surpassing comparable approaches and setting the state-of-the-art (SOTA) performance among methods using Polar coordinates, which demonstrates its efficacy. In addition, the model is deployed on an Nvidia Jetson AGX Orin, reaching real-time inference with a latency of 31.42 ms. These findings affirm PolarBEVU’s potential for practical applications. Code is available at https://github.com/JLUrob/PolarBEVU.
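The paper does not include code in the abstract; as a minimal illustrative sketch (function names are hypothetical), the Cartesian-to-Polar BEV mapping the abstract contrasts can be written as:

```python
import math

def cartesian_to_polar_bev(x, y):
    """Map a BEV point from Cartesian (x, y) to Polar (range, azimuth).

    A Polar BEV grid is indexed by (range, azimuth), so each grid column
    follows a camera ray rather than a fixed lateral offset -- this is the
    alignment with camera geometry the abstract refers to.
    """
    r = math.hypot(x, y)      # radial distance from the ego vehicle
    theta = math.atan2(y, x)  # azimuth angle in radians
    return r, theta

def polar_to_cartesian_bev(r, theta):
    """Inverse mapping, used when resampling Polar features onto a
    Cartesian grid -- the transformation step where distortion can arise."""
    return r * math.cos(theta), r * math.sin(theta)
```

Resampling between these two grids is lossy in practice because cell areas differ (Polar cells grow with range), which is one source of the distortion the paper addresses.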
External IDs: dblp:journals/tcsv/HouLWMXHF25