Dynamic Cues-Assisted Transformer for Robust Point Cloud Registration

Published: 15 Jun 2024, Last Modified: 16 Oct 2025 · CVPR 2024 (IEEE/CVF Conference on Computer Vision and Pattern Recognition) · CC BY 4.0
Abstract: Point cloud registration is a critical and challenging task in computer vision. Recent advancements have predominantly embraced a coarse-to-fine matching mechanism, where the key is to match superpoints located in patches with interframe-consistent structures. However, previous methods still face challenges with ambiguous matching, because interfering information aggregated from irrelevant regions can disturb the capture of interframe consistency relations, leading to incorrect matches. To address this issue, we propose the Dynamic Cues-Assisted Transformer (DCATr). Firstly, the interference from irrelevant regions is greatly reduced by constraining attention to certain cues, i.e., regions with highly correlated structures of potential corresponding superpoints. Secondly, cues-assisted attention is designed to mine the interframe consistency relations, assigning more attention to pairs with high consistency confidence during feature aggregation. Finally, a dynamic updating scheme is proposed to facilitate mining richer consistency information, further improving the distinctiveness of the aggregated features and relieving matching ambiguity. Extensive evaluations on standard indoor and outdoor benchmarks demonstrate that DCATr outperforms all state-of-the-art methods.
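The abstract does not give implementation details, but the core idea of restricting attention to a small set of highly correlated cue regions can be illustrated with a minimal NumPy sketch. The function name `cues_assisted_attention`, the `num_cues` parameter, and the simple top-k correlation criterion below are illustrative assumptions, not the paper's actual mechanism (which also includes dynamic cue updating and consistency-confidence weighting).

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cues_assisted_attention(src_feats, tgt_feats, num_cues=8):
    """Toy cross-frame attention restricted to 'cue' positions (assumed form).

    src_feats: (N, C) superpoint features of the source frame.
    tgt_feats: (M, C) superpoint features of the target frame.
    num_cues:  number of target superpoints each source superpoint
               may attend to (its cue set).
    Returns updated source features of shape (N, C).
    """
    # Correlation between every source/target superpoint pair.
    corr = src_feats @ tgt_feats.T                             # (N, M)

    # Cue selection: keep only the top-k most correlated targets per
    # source superpoint; all other pairs are ignored, approximating the
    # idea of constraining attention to highly correlated regions.
    k = min(num_cues, tgt_feats.shape[0])
    cue_idx = np.argpartition(-corr, k - 1, axis=1)[:, :k]     # (N, k)

    # Attention computed over the cue set only.
    cue_feats = tgt_feats[cue_idx]                             # (N, k, C)
    cue_scores = np.take_along_axis(corr, cue_idx, axis=1)     # (N, k)
    weights = softmax(cue_scores / np.sqrt(src_feats.shape[1]), axis=1)

    # Aggregate cue features and fuse with the original features.
    aggregated = (weights[..., None] * cue_feats).sum(axis=1)  # (N, C)
    return src_feats + aggregated

# Usage on random toy data.
rng = np.random.default_rng(0)
src = rng.normal(size=(32, 64)).astype(np.float32)
tgt = rng.normal(size=(40, 64)).astype(np.float32)
print(cues_assisted_attention(src, tgt, num_cues=8).shape)     # (32, 64)
```

In this sketch, masking out non-cue pairs is what keeps aggregation from mixing in features of irrelevant regions; in the paper the cue sets are additionally refined over iterations rather than fixed by a single top-k selection.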