A comparison of deep learning-based visual odometry algorithms in challenging scenarios

Hudson Martins Silva Bruno, Kleber M. Cabral, Esther Luna Colombini, Sidney N. Givigi

Published: 2024, Last Modified: 30 Oct 2025, Venue: SysCon 2024, License: CC BY-SA 4.0

Abstract: Visual Odometry (VO) estimates a camera’s position and orientation based only on visual information. Classical VO algorithms can fail when either the camera’s motion or the environment is too challenging. Recently, approaches based on Deep Neural Networks (DNNs) have achieved promising results in performing an end-to-end VO pipeline. However, these end-to-end approaches still cannot outperform classical methods. To leverage the robustness of deep learning and still be as accurate as a classical VO system, we propose to explore hybrid VO systems. These systems use DNNs to replace only some modules of a classical VO pipeline. Although some authors have proposed hybrid VO algorithms in the literature, they do not evaluate the robustness of their systems in challenging situations. Moreover, most DNNs devised for a step of the VO pipeline are not evaluated inside a VO system due to the difficulty of developing the entire pipeline. Therefore, we evaluated some of the state-of-the-art algorithms for feature detection and feature matching inside a hybrid VO pipeline. The experiments were conducted in a controlled indoor environment, and emulated camera failures were created to assess the robustness of the algorithms. Our results show that learning-based algorithms are more accurate than classical ones and that the feature detection module is critical to creating robust systems.

External IDs: dblp:conf/syscon/BrunoCCG24