ColVO: Colonoscopic Visual Odometry Considering Geometric and Photometric Consistency

Published: 20 Jul 2024, Last Modified: 21 Jul 2024
Venue: MM2024 Poster
License: CC BY 4.0
Abstract: Locating lesions is the primary goal of colonoscopy examinations. 3D perception techniques can enhance the accuracy of lesion localization by restoring 3D spatial information of the colon. However, existing methods focus on local depth estimation from a single frame and neglect precise global positioning of the colonoscope, and therefore fail to provide accurate 3D lesion locations. This shortfall has two root causes. First, existing methods treat colon depth estimation and colonoscope pose estimation as independent tasks or design them as parallel sub-task branches. Second, the light source in the colon moves with the colonoscope, so brightness fluctuates across consecutive frames. To address these two issues, we propose ColVO, a novel deep learning-based visual odometry framework that continuously estimates colon depth and colonoscope pose using two key components: a deep couple strategy for depth and pose estimation (DCDP) and a light consistent calibration mechanism (LCC). DCDP uses multimodal fusion and loss function constraints to couple depth and pose estimation, ensuring that geometric projections align between consecutive frames. Meanwhile, LCC accounts for brightness variations by recalibrating the luminosity values of adjacent frames, enhancing ColVO's robustness. A comprehensive evaluation on colon odometry benchmarks shows that ColVO outperforms state-of-the-art methods in both depth and pose estimation. We also demonstrate two valuable applications: immediate polyp localization and complete 3D reconstruction of the intestine. The code for ColVO is available at https://github.com/xxx/xxx.
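The abstract does not say how LCC recalibrates luminosity, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a simple affine brightness model, in which one frame is matched to the next by a gain and an offset fitted by least squares. The function names `calibrate_brightness` and `photometric_error` and the synthetic frames are hypothetical.

```python
import numpy as np


def calibrate_brightness(src, ref):
    """Fit gain a and offset b so that a * src + b best matches ref.

    Assumption: a global affine brightness model solved in closed form
    by least squares. This is a stand-in, not the paper's LCC module.
    """
    s = src.astype(np.float64).ravel()
    r = ref.astype(np.float64).ravel()
    s_mean, r_mean = s.mean(), r.mean()
    var = ((s - s_mean) ** 2).mean()
    # Fall back to unit gain when the source frame is (nearly) constant.
    a = 1.0 if var < 1e-12 else ((s - s_mean) * (r - r_mean)).mean() / var
    b = r_mean - a * s_mean
    return a, b


def photometric_error(src, ref, a=1.0, b=0.0):
    """Mean absolute intensity difference after applying gain and offset."""
    return float(np.abs(a * src + b - ref).mean())


# Synthetic example: the same scene, lit more brightly in the next frame.
rng = np.random.default_rng(0)
frame_t = rng.uniform(0.1, 0.8, size=(32, 32))
frame_t1 = 1.2 * frame_t + 0.05

a, b = calibrate_brightness(frame_t, frame_t1)
raw_error = photometric_error(frame_t, frame_t1)
calibrated_error = photometric_error(frame_t, frame_t1, a, b)
```

In this synthetic case the fitted gain and offset recover the simulated lighting change, so the photometric error after calibration drops to numerical noise. Real colonoscopy frames would also differ by camera motion, which a full pipeline would handle by warping one frame into the other using the estimated depth and pose before comparing intensities.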
Primary Subject Area: [Content] Multimodal Fusion
Secondary Subject Area: [Experience] Multimedia Applications
Relevance To Conference: During 2D colonoscopy, doctors often struggle to accurately locate the spatial position of lesions. Our proposed ColVO framework perceives 3D intestinal information by fusing multiple modalities to estimate colon depth and colonoscope pose simultaneously. The DCDP strategy in ColVO improves colonoscope pose estimation by fusing cross-modal RGB and inferred depth features. Moreover, the LCC module improves the model's robustness and accuracy in depth and pose estimation under nonuniform illumination. A comprehensive evaluation on colon odometry benchmarks shows that ColVO outperforms state-of-the-art methods in both depth and pose estimation. ColVO can be applied in highly challenging colon environments, providing valuable auxiliary diagnostic information for colonoscopy, including lesion localization and 3D colon modeling.
Submission Number: 3239