G²-Mapping: General Gaussian Mapping for Monocular, RGB-D, and LiDAR-Inertial-Visual Systems

Published: 01 Jan 2025 · Last Modified: 15 Sep 2025 · IEEE Trans. Autom. Sci. Eng. 2025 · CC BY-SA 4.0
Abstract: In this paper, we introduce G²-Mapping, a novel method that comprehensively supports online monocular, RGB-D, and LiDAR-Inertial-Visual systems, employing 3D Gaussian points as the scene representation. Several issues arise when applying 3D Gaussian Splatting (3DGS) techniques to simultaneous localization and mapping (SLAM): 1) for monocular input, the lack of depth information makes scene initialization difficult and positioning under large baselines challenging; 2) differentiable rendering with respect to depth and pose is not implemented in standard 3DGS, so it cannot be applied directly to a SLAM system; 3) no strategy exists for updating the scene with incoming online frames, which may lead to memory overflow. To overcome these problems, we formulate a mathematical derivation and propose a differentiable rendering approach that leverages both depth and color to optimize the scene and the pose. We introduce a simplified odometry that provides metric depth estimates for the monocular case and improves usability in low-overlap scenes. A scale-consistency and uncertainty-weighted optimization is further proposed to eliminate the impact of inaccurate depth predictions. Our proposed scene-updating strategy effectively prevents rapid memory growth. Tracking and mapping are performed alternately to achieve precise localization and synchronous high-fidelity map reconstruction. Extensive experiments demonstrate that G²-Mapping surpasses feature-based SLAM in localization precision and exceeds state-of-the-art neural SLAM methods in the fidelity of view synthesis.

Note to Practitioners: This paper is dedicated to tackling the efficiency challenges in multi-source SLAM and map reconstruction, aiming to generate poses and high-fidelity maps synchronously. We introduce G²-Mapping, a novel framework that leverages the power of 3D Gaussian points for scene representation, offering a universal solution for monocular, RGB-D, and LiDAR-Inertial-Visual systems. By developing a comprehensive differentiable renderer and presenting a strategy for dynamic scene updating, G²-Mapping significantly advances the state of the art in localization precision and view-synthesis fidelity. Although the approach is highly promising, it currently relies on the accuracy of depth prediction networks and requires further optimization for handling sparse LiDAR data. Future research will focus on enhancing these aspects, aiming to seamlessly integrate G²-Mapping into practical applications within robotics, autonomous vehicles, and augmented reality, where robust and efficient SLAM solutions are paramount.
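To make the alternating tracking/mapping optimization concrete, the following is a minimal PyTorch-style sketch of one plausible form of the uncertainty-weighted color-and-depth objective and the per-frame alternation described in the abstract. It is illustrative only: `render_fn` stands in for a differentiable 3DGS rasterizer that returns per-pixel color and depth, and the function names, loss weights, and iteration counts are assumptions rather than the paper's actual implementation.

```python
import torch

def weighted_rgbd_loss(rendered_rgb, rendered_depth, gt_rgb, gt_depth,
                       depth_var, lambda_depth=0.5):
    """Hypothetical uncertainty-weighted color + depth objective.

    Pixels whose depth is uncertain (large variance, e.g. from a monocular
    depth network) are down-weighted by inverse variance, so inaccurate
    depth predictions contribute less to the pose and map updates.
    """
    color_term = torch.abs(rendered_rgb - gt_rgb).mean()
    valid = gt_depth > 0                       # ignore pixels without depth
    w = 1.0 / (depth_var[valid] + 1e-6)        # inverse-variance weights
    depth_term = (w * torch.abs(rendered_depth[valid] - gt_depth[valid])).mean()
    return color_term + lambda_depth * depth_term

def track_then_map(render_fn, gaussians, pose, frame, iters=(30, 60)):
    """Alternate pose-only and map-only optimization for one incoming frame.

    `gaussians` is a list of learnable tensors (means, covariances, colors,
    opacities); `pose` is a learnable camera-pose tensor; `render_fn` is a
    placeholder for a differentiable Gaussian rasterizer.
    """
    # Tracking: freeze the Gaussian scene, optimize only the camera pose.
    pose_opt = torch.optim.Adam([pose], lr=1e-3)
    for _ in range(iters[0]):
        rgb, depth = render_fn(gaussians, pose)
        loss = weighted_rgbd_loss(rgb, depth, frame["rgb"],
                                  frame["depth"], frame["depth_var"])
        pose_opt.zero_grad(); loss.backward(); pose_opt.step()

    # Mapping: freeze the pose, optimize the Gaussian scene parameters.
    map_opt = torch.optim.Adam(gaussians, lr=2e-3)
    for _ in range(iters[1]):
        rgb, depth = render_fn(gaussians, pose.detach())
        loss = weighted_rgbd_loss(rgb, depth, frame["rgb"],
                                  frame["depth"], frame["depth_var"])
        map_opt.zero_grad(); loss.backward(); map_opt.step()
    return pose, gaussians
```

Under these assumptions, the same loop covers all three sensor setups: RGB-D and LiDAR-Inertial-Visual systems supply measured depth (with small variance), while the monocular case feeds in network-predicted metric depth with its estimated uncertainty.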