Abstract: Stereo matching methods based on iterative optimization have become a popular research direction for accurate stereo matching. However, existing methods often neglect the interaction between deep and shallow layers, making it difficult to simultaneously capture global low-frequency and local high-frequency information. To address this issue, we propose DS-Stereo, which enables full interaction between deep and shallow information to reduce ambiguity in ill-posed regions. First, DS-Stereo incorporates an Adjacent Feature Hybrid Attention (AFHA) Block after the feature extraction network to fuse global and local information from adjacent feature maps. In addition, we introduce a Hierarchical Cost Aggregation (HCA) Module to integrate geometric details from both deep and shallow layers during cost aggregation. Finally, to overcome the limitations of traditional recurrent units, we design a Selective Inception-based Iterative Unit (SIIU) with a larger receptive field and stronger convergence capability. Experimental results on the Scene Flow, KITTI 2012, KITTI 2015, and Middlebury benchmarks demonstrate that DS-Stereo outperforms almost all current state-of-the-art stereo matching methods and exhibits strong robustness in ill-posed regions.
External IDs: dblp:journals/ral/LinDW25