CF-ACV: Fast-ACV Based on Context and Geometry Fusion for Stereo Matching

Xurong Wang, Zhuohao Gong, Wenxin Hu, Qianqian Wang, Zixuan Shangguan, Ziyuan Wen

Published: 01 Jan 2024, Last Modified: 13 May 2025. Venue: SmartIoT 2024. License: CC BY-SA 4.0

Abstract: Stereo matching is a crucial technique in stereo depth estimation, and real-time use on mobile devices requires a method that is both fast and highly accurate. The Context and Geometry Fusion (CGF) module adaptively fuses context and geometric information, enabling more effective cost aggregation and improving both stereo matching accuracy and computation speed. Fast-ACV, a streamlined version of ACV, is designed to reduce computational cost and increase processing speed, albeit with some loss of accuracy. To balance speed and accuracy in stereo matching, this paper proposes CF-ACV, a stereo matching method based on the CGF module and Fast-ACV. This approach achieves higher accuracy while maintaining rapid computation. The model was evaluated on the Scene Flow, KITTI2012, and KITTI2015 datasets, and it consistently outperforms Fast-ACV on all three. Notably, on the KITTI datasets, our model surpasses most real-time state-of-the-art stereo matching networks. On KITTI2012, the 3-all, 3-noc, EPE-all, and EPE-noc of CF-ACV were 2.05%, 1.68%, 0.5 px, and 0.5 px, respectively. On KITTI2015, the D1-bg, D1-fg, and D1-all were 1.81%, 3.31%, and 2.07%, respectively. Meanwhile, CF-ACV has a faster runtime (29 ms) on the KITTI datasets than Fast-ACV and most real-time state-of-the-art methods. These results demonstrate that the proposed CF-ACV method achieves excellent stereo matching performance while maintaining high computational speed.
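The abstract reports its results as end-point error (EPE) and outlier rates (3-all, 3-noc, D1-bg, D1-fg, D1-all). The sketch below shows one way such metrics are commonly computed from predicted and ground-truth disparities. It is not the authors' evaluation code: the function names `epe` and `outlier_rate`, the flat-list input, and the `valid` mask are hypothetical choices made for illustration. The thresholds follow the usual KITTI convention (error above 3 px for KITTI2012, and error above both 3 px and 5% of the true disparity for KITTI2015 D1), which is assumed here rather than stated in the abstract.

```python
from typing import Optional, Sequence


def _valid_errors(pred: Sequence[float], gt: Sequence[float],
                  valid: Optional[Sequence[bool]]):
    """Return (absolute error, ground truth) pairs for the valid pixels."""
    if valid is None:
        valid = [True] * len(gt)
    pairs = [(abs(p - g), g) for p, g, v in zip(pred, gt, valid) if v]
    if not pairs:
        raise ValueError("no valid pixels to evaluate")
    return pairs


def epe(pred: Sequence[float], gt: Sequence[float],
        valid: Optional[Sequence[bool]] = None) -> float:
    """Mean absolute disparity error in pixels over the valid pixels."""
    pairs = _valid_errors(pred, gt, valid)
    return sum(err for err, _ in pairs) / len(pairs)


def outlier_rate(pred: Sequence[float], gt: Sequence[float],
                 valid: Optional[Sequence[bool]] = None,
                 abs_thresh: float = 3.0,
                 rel_thresh: Optional[float] = None) -> float:
    """Percentage of valid pixels counted as outliers.

    A pixel is an outlier if its error exceeds abs_thresh and, when
    rel_thresh is given, also exceeds rel_thresh * ground-truth disparity
    (assumed D1-style rule).
    """
    pairs = _valid_errors(pred, gt, valid)
    bad = 0
    for err, g in pairs:
        is_bad = err > abs_thresh
        if rel_thresh is not None:
            is_bad = is_bad and err > rel_thresh * abs(g)
        bad += is_bad
    return 100.0 * bad / len(pairs)


if __name__ == "__main__":
    # Toy disparities (illustrative values, not from the paper).
    pred = [10.0, 20.5, 33.0, 104.0]
    gt = [10.0, 20.0, 30.0, 100.0]
    print(epe(pred, gt))                            # EPE in pixels
    print(outlier_rate(pred, gt))                   # 3 px outlier rate (%)
    print(outlier_rate(pred, gt, rel_thresh=0.05))  # D1-style rate (%)
```

Under these assumptions, the "noc" and "all" variants differ only in the `valid` mask (non-occluded pixels versus all pixels with ground truth), and D1-bg and D1-fg would restrict the mask to background or foreground regions.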