Self-supervised multi-frame depth estimation with visual-inertial pose transformer and monocular guidance
Abstract: Highlights
• A new self-supervised multi-frame depth network incorporating IMU modality.
• A visual-inertial fusion Transformer to improve pose estimation involved in multi-frame depth.
• A monocular guided excitation module bridges monocular and multi-frame depth branches.
• Experiments demonstrate improved depth accuracy against previous approaches.