DeepPointMap2: Accurate and Robust LiDAR-Visual SLAM with Neural Descriptors

Published: 20 Jul 2024, Last Modified: 21 Jul 2024 | MM2024 Poster | CC BY 4.0
Abstract: Simultaneous Localization and Mapping (SLAM) plays a pivotal role in autonomous driving and robotics. Given the complexity of road environments, there is a growing research emphasis on developing robust and accurate multi-modal SLAM systems. Existing methods often rely on hand-crafted feature extraction and cross-modal fusion techniques, resulting in limited feature representation capability and reduced flexibility and robustness. To address this challenge, we introduce DeepPointMap2, a novel learning-based LiDAR-Visual SLAM architecture that leverages neural descriptors to tackle multiple SLAM subtasks in a unified manner. Our approach employs neural networks to extract multi-modal feature tokens, which are then adaptively fused by the Visual-Point Fusion Module to generate sparse neural 3D descriptors, ensuring precise localization and robust performance. As a pioneering work, our method achieves state-of-the-art localization performance among various visual-based, LiDAR-based, and visual-LiDAR-based methods on widely used benchmarks, as shown in the experimental results. Furthermore, the approach proves to be robust in scenarios involving camera failure and LiDAR obstruction.
Primary Subject Area: [Content] Multimodal Fusion
Relevance To Conference: This work contributes to multimodal processing by introducing DeepPointMap2, a multi-modal SLAM (Simultaneous Localization and Mapping) architecture that integrates and processes data from two sensor modalities, LiDAR and visual inputs. DeepPointMap2 uses neural networks for feature extraction and fusion to address the challenges of complex environments in autonomous driving and robotics. Experimental results demonstrate state-of-the-art performance on benchmarks, with improved localization accuracy and robustness. This highlights advances in multi-modal SLAM methodologies, particularly in robustness and accuracy in challenging environments.
Supplementary Material: zip
Submission Number: 4475