Keywords: stereo matching, deep learning
Abstract: High-performance real-time stereo matching methods invariably rely on 3D regularization of the 4D cost volume, which is unfriendly to mobile devices,
while methods based on 2D regularization of the 3D cost volume struggle in ill-posed regions.
In this paper, we propose Decoupling Bidirectional Geometric Representations of the 4D cost volume and present DBStereo, a deployment-friendly network based on pure 2D convolutions.
Specifically, we first provide a thorough analysis of the decoupling characteristics of the 4D cost volume and then design a lightweight decoupled bidirectional geometry aggregation block to capture spatial and disparity representations separately.
Through decoupled learning, our approach achieves real-time performance and impressive accuracy simultaneously.
Extensive experiments demonstrate that DBStereo outperforms all existing aggregation-based methods in both inference time and accuracy, and even surpasses iterative methods such as RAFT-Stereo and IGEV-Stereo.
Our study breaks with the empirical convention of using 3D convolutions for the 4D cost volume and provides a simple yet strong baseline, the proposed decoupled aggregation paradigm, to facilitate further study.
Primary Area: applications to robotics, autonomy, planning
Submission Number: 10700