DS-Stereo: Deep-Shallow Information Interaction for Stereo Matching

Published: 2025, Last Modified: 19 Mar 2026, IEEE Robotics Autom. Lett. 2025, CC BY-SA 4.0
Abstract: Stereo matching methods based on iterative optimization have become a popular research direction for accurate stereo matching. However, existing methods often neglect the interaction between deep and shallow layers, making it difficult to simultaneously capture global low-frequency and local high-frequency information. To address this issue, we propose DS-Stereo, which enables full interaction between deep and shallow information to reduce ambiguity in ill-posed regions. First, DS-Stereo incorporates an Adjacent Feature Hybrid Attention (AFHA) Block after the feature extraction network to fuse global and local information from adjacent feature maps. In addition, we introduce a Hierarchical Cost Aggregation (HCA) Module to integrate geometric details from both deep and shallow layers during cost aggregation. Finally, to overcome the limitations of traditional recurrent units, we design a Selective Inception-based Iterative Unit (SIIU) with a larger receptive field and stronger convergence capability. Experimental results on the Scene Flow, KITTI 2012, KITTI 2015, and Middlebury datasets demonstrate that DS-Stereo outperforms almost all current state-of-the-art stereo matching methods and exhibits strong robustness in ill-posed regions.