DXFeat: Depth-Aware Features for Robust Image Matching

Cian Huang; Tai-Cyuan Ciou; Jing-Ming Guo

20 Sept 2025 (modified: 11 Feb 2026) | Submitted to ICLR 2026 | CC BY 4.0

Keywords: Image Matching, Keypoint Detection, Sparse Matching, Semi-Dense Matching, Depth-Auxiliary

Abstract: This study introduces DXFeat, a novel architecture that integrates depth information as an auxiliary branch for keypoint detection. Leveraging depth cues improves localization accuracy by 3.1% on average while preserving inference efficiency. The depth auxiliary branch refines feature extraction and is needed only during training, which keeps deployment streamlined. The model incorporates a modified reliability loss and learnable weighting mechanisms to balance accuracy and robustness. By optimizing network channels while preserving high-resolution inputs, DXFeat supports both sparse and semi-dense matching, making it well suited to visual localization and augmented reality. A depth-assisted refinement module further enhances feature representation using coarse local descriptors. Comprehensive evaluations on MegaDepth, ScanNet, and HPatches confirm that the combination of loss-level optimization and depth-auxiliary refinement yields consistent AUC improvements, establishing DXFeat as a strong and efficient framework for real-world image matching tasks.

Supplementary Material: pdf

Primary Area: applications to computer vision, audio, language, and other modalities

Submission Number: 24549
