TiG-BEV: Multi-view BEV 3D Object Detection via Target Inner-Geometry Learning

15 Sept 2023 (modified: 25 Mar 2024), ICLR 2024 Conference Desk Rejected Submission, Readers: Everyone
Keywords: BEV, 3D Object Detection
Abstract: To achieve accurate multi-view 3D object detection, existing methods enhance camera-based detectors with spatial cues from the LiDAR modality, e.g., depth supervision and bird's-eye-view (BEV) feature distillation. However, they rely on direct point-to-point mimicry from LiDAR to camera, which suffers from the modality gap between 2D and 3D features. In this paper, we propose TiG-BEV, a Target Inner-Geometry learning scheme that leverages the LiDAR modality to enhance camera-based BEV detectors in both depth and BEV feature learning. First, we introduce an inner-depth supervision module to learn the low-level relative depth relations within each object, which equips camera-based detectors with a deeper understanding of object-level spatial structures. Second, we design an inner-feature BEV distillation module to imitate the high-level semantics of different keypoints within foreground targets. To further alleviate the domain gap between the two modalities, we incorporate both inter-channel and inter-keypoint distillation to model feature similarity. With our target inner-geometry learning, TiG-BEV boosts BEVDepth by +2.3% NDS on the nuScenes val set and achieves leading performance with 61.9% NDS on the nuScenes leaderboard.
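The abstract names two ingredients but gives no formulas, so the following is only an illustrative sketch of the ideas and not the authors' implementation. It assumes that (a) inner-depth supervision compares depths relative to a reference pixel inside each object, and (b) inner-feature distillation matches cosine-similarity matrices computed across keypoints and across channels. All function names, the choice of reference pixel, and the mean-squared-error losses are hypothetical.

```python
import numpy as np


def relative_depth(depths, ref_index):
    """Depth of each object pixel relative to a reference pixel (assumed form)."""
    depths = np.asarray(depths, dtype=float)
    return depths - depths[ref_index]


def inner_depth_loss(pred_depths, lidar_depths, ref_index=0):
    """MSE between predicted and LiDAR relative depths within one object."""
    pred_rel = relative_depth(pred_depths, ref_index)
    lidar_rel = relative_depth(lidar_depths, ref_index)
    return float(np.mean((pred_rel - lidar_rel) ** 2))


def _normalize_rows(x, eps=1e-12):
    """Scale every row to unit L2 norm so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def inter_keypoint_similarity(feats):
    """(K, C) keypoint features -> (K, K) cosine similarity between keypoints."""
    f = _normalize_rows(np.asarray(feats, dtype=float))
    return f @ f.T


def inter_channel_similarity(feats):
    """(K, C) keypoint features -> (C, C) cosine similarity between channels."""
    f = _normalize_rows(np.asarray(feats, dtype=float).T)
    return f @ f.T


def inner_feature_distill_loss(student, teacher):
    """Sum of MSEs between student (camera) and teacher (LiDAR) similarity maps."""
    kp = np.mean((inter_keypoint_similarity(student)
                  - inter_keypoint_similarity(teacher)) ** 2)
    ch = np.mean((inter_channel_similarity(student)
                  - inter_channel_similarity(teacher)) ** 2)
    return float(kp + ch)


# Toy data: 5 keypoints with 8 channels sampled inside one foreground target.
rng = np.random.default_rng(0)
teacher_feats = rng.normal(size=(5, 8))
student_feats = rng.normal(size=(5, 8))
distill_loss = inner_feature_distill_loss(student_feats, teacher_feats)

# A constant depth offset leaves the relative depths unchanged.
depth_loss = inner_depth_loss([12.0, 12.5, 13.0], [10.0, 10.5, 11.0])
```

Under these assumptions, both losses ignore differences that point-to-point mimicry would penalize: a constant depth offset over the whole object, or a global rescaling of the student features, yields zero loss because only relations inside the target are compared.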
Supplementary Material: pdf
Primary Area: applications to robotics, autonomy, planning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 371