RVO-MIS: Robust Visual Odometry for Minimally Invasive Surgery

Zhuo Wang; Chiang-Heng Chien; Eungjoo Lee

Published: 14 Feb 2026, Last Modified: 14 Feb 2026. Venue: MIDL 2026 Poster. License: CC BY 4.0

Keywords: Visual Odometry, Minimally Invasive Surgery, Feature-based Tracking

TL;DR: This paper presents RVO-MIS, a robust visual odometry framework that combines deep learning-based feature matching with geometric pose estimation to significantly outperform existing methods in challenging minimally invasive surgery environments.

Abstract: Visual odometry (VO) in Minimally Invasive Surgery (MIS) plays a crucial role in current and future endoscopic surgical assistance systems. However, MIS environments are severely challenging for typical VO algorithms because of textureless surfaces, moving surgical instruments, varying lighting angles, smoke generated during surgery, and organ deformation. Recent work in this domain has increasingly incorporated deep learning-based depth estimation into photometric tracking frameworks to address the challenges posed by textureless regions. Yet photometric tracking remains fragile, particularly in MIS scenes where specular reflections induce rapid and unpredictable illumination changes. In this paper, we propose a robust VO method based on feature point matching that uses M-estimator Sample Consensus (MSAC) and Perspective-3-Point (P3P) absolute pose estimation to obtain accurate camera poses. To resolve the scale ambiguity, the scale of the absolute pose estimation is fixed by triangulating 3D points between keyframes to construct a point cloud in the coordinate system of the first image. Evaluated on the SCARED dataset, our approach demonstrates consistently accurate camera pose estimation, achieving a translation ATE (RMSE) of 0.2970 cm in the best case. Quantitative results indicate that our method significantly outperforms established baseline methods in both translation and rotation metrics, validating its robustness in challenging MIS environments.
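For readers unfamiliar with the reported metric, the translation ATE (RMSE) can be illustrated with a minimal sketch. It assumes the estimated and ground-truth camera positions are already expressed in the same coordinate frame and units (here, cm) and are paired frame by frame; any trajectory alignment step the authors may apply before evaluation is not specified in the abstract and is omitted. The function name `translation_ate_rmse` and the sample trajectories are hypothetical and are not taken from the paper.

```python
import math


def translation_ate_rmse(estimated, ground_truth):
    """Root-mean-square of per-frame translation errors.

    Both arguments are sequences of (x, y, z) camera positions.
    Assumption: the two trajectories are already aligned and
    time-synchronized, so frame i in one matches frame i in the other.
    """
    if len(estimated) != len(ground_truth):
        raise ValueError("trajectories must have the same number of frames")
    if not estimated:
        raise ValueError("trajectories must not be empty")

    # Squared Euclidean distance between each estimated and true position.
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(p_est, p_gt))
        for p_est, p_gt in zip(estimated, ground_truth)
    ]
    # RMSE: square root of the mean squared error over all frames.
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical three-frame trajectories in cm, for illustration only.
example_estimated = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.1, 0.0)]
example_ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.2), (2.0, 0.0, 0.0)]
example_ate = translation_ate_rmse(example_estimated, example_ground_truth)
```

Because the errors are squared before averaging, a few frames with large drift raise the RMSE more than many frames with small errors, which is why the metric is commonly used to summarize trajectory accuracy.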

Primary Subject Area: Application: Endoscopy

Secondary Subject Area: Integration of Imaging and Clinical Data

Registration Requirement: Yes

Read CFP & Author Instructions: Yes

Originality Policy: Yes

Single-blind & Not Under Review Elsewhere: Yes

LLM Policy: Yes

Submission Number: 34
