Two-View Correspondence Pruning via Channel-Spatial Interaction and Bidirectional Consensus Interaction

Xiangui Huang, Taotao Lai, Yizhang Liu, Shuyuan Lin, Zuoyong Li

Published: 27 Oct 2025, Last Modified: 01 Nov 2025, Crossref, CC BY-SA 4.0

Abstract: Accurately identifying correct correspondences between two images is a crucial task in computer vision. Current methods predominantly use PointCN blocks as feature extraction backbones and learn local-global consensus through a progressive learning strategy. However, such methods have two main drawbacks. First, PointCN blocks, composed of multilayer perceptrons and normalization layers, process spatial positions independently, leading to limited interaction between the channel-wise and spatial-wise dimensions. Second, the progressive learning strategy primarily focuses on unidirectional transfer from local to global consensus and neglects the bidirectional interaction between the two. To address these issues, we propose the Channel-Spatial interaction and Bidirectional Consensus interaction-Based Network (CSBCNet), which contains three innovative blocks: Channel-Spatial Interaction (CSI), Local Consensus Mining (LCM), and Global Consensus-Aware Attention (GCAA). Specifically, CSI enhances interaction between the channel-wise and spatial-wise dimensions through a dual-path attention mechanism, addressing the limited interaction caused by the independent processing of spatial positions in PointCN blocks. LCM extracts reliable local consensus by modeling geometric structures and spatial continuity within correspondences. GCAA captures global consensus by aggregating correspondences that are highly likely to be correct, and achieves bidirectional interaction between local and global consensus through cross attention. Experiments demonstrate that CSBCNet achieves superior performance in camera pose estimation and correspondence pruning. Notably, when the CSI block is applied to the existing OANet and MS2DGNet networks, it achieves significant performance improvements of 10.27% and 7.5%, respectively, on the mAP5° metric for the camera pose estimation task.
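
The first drawback named in the abstract can be illustrated with a small NumPy sketch. It is not the authors' code: it assumes a simplified PointCN-style block (context normalization over the correspondences, a shared perceptron with ReLU, and a residual connection), and the function names `context_norm`, `shared_mlp`, and `pointcn_like_block` are hypothetical. It shows that the shared perceptron maps each correspondence using only that correspondence's own features, so in this sketch rows interact only through the mean and variance statistics of the normalization.

```python
import numpy as np

def context_norm(x, eps=1e-5):
    # x has shape (N, C): N correspondences, C channels.
    # Each channel is normalized over the N correspondences, so rows
    # only share information through these mean/std statistics.
    mean = x.mean(axis=0, keepdims=True)
    std = x.std(axis=0, keepdims=True)
    return (x - mean) / (std + eps)

def shared_mlp(x, w, b):
    # One linear layer + ReLU with the same weights for every row:
    # output row i depends only on input row i.
    return np.maximum(x @ w + b, 0.0)

def pointcn_like_block(x, w, b):
    # Simplified block (assumption): normalize, shared perceptron, residual.
    return x + shared_mlp(context_norm(x), w, b)

rng = np.random.default_rng(0)
N, C = 6, 4
x = rng.normal(size=(N, C))
w = rng.normal(size=(C, C))
b = np.zeros(C)

# Perturb only the first correspondence.
x_perturbed = x.copy()
x_perturbed[0] += 1.0

# The shared perceptron leaves all other rows untouched.
out_a = shared_mlp(x, w, b)
out_b = shared_mlp(x_perturbed, w, b)
rows_unchanged = np.allclose(out_a[1:], out_b[1:])

# The whole block is permutation-equivariant: reordering the input rows
# only reorders the output rows.
perm = rng.permutation(N)
equivariant = np.allclose(
    pointcn_like_block(x[perm], w, b),
    pointcn_like_block(x, w, b)[perm],
)
```

Under these assumptions, both `rows_unchanged` and `equivariant` evaluate to true, which is the kind of limited cross-position interaction that an attention-based block such as CSI is described as addressing.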

External IDs: doi:10.1145/3746027.3755581