Abstract: Current deep homography estimation methods are typically limited to low-resolution image pairs by their network architectures and computational cost. High-resolution images must often be downsampled first, which can greatly degrade estimation accuracy. In contrast, image matching methods, which match pixels and compute the homography from the resulting correspondences, offer greater resolution flexibility. In this work, we revisit the traditional image matching paradigm for homography estimation and propose GFNet, a Grid Flow regression Network that adapts the high-accuracy dense matching framework to homography estimation while improving efficiency through a grid-based strategy: because a homography is globally smooth, flow is estimated only over a coarse grid. We demonstrate the effectiveness of GFNet in a wide range of experiments on multiple datasets, including the common-scene dataset MSCOCO, the multimodal datasets VIS-IR and GoogleMap, and the dynamic-scene dataset VIRAT. Notably, on 448×448 GoogleMap, GFNet improves AUC@3 by +13.5% while reducing MACs by ~47% compared to the SOTA dense matching method. Additionally, it shows a 1.8× improvement in AUC@3 over the SOTA deep homography method. Code is available at https://github.com/KN-Zhang/GFNet.
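
To illustrate why flow on a coarse grid can suffice, the following minimal sketch recovers a homography from grid correspondences alone. It is not GFNet's implementation: the flow here is synthesized from a known homography rather than predicted by a network, and the solver is a plain normalized direct linear transform (DLT), chosen as an assumption because the abstract does not specify one. All function names, the 64-pixel grid step, and the example matrix `H_true` are hypothetical.

```python
import numpy as np


def make_grid(height, width, step):
    """Return (N, 2) pixel coordinates (x, y) of a coarse regular grid."""
    ys, xs = np.meshgrid(
        np.arange(0, height, step), np.arange(0, width, step), indexing="ij"
    )
    return np.stack([xs.ravel(), ys.ravel()], axis=1).astype(np.float64)


def apply_homography(H, pts):
    """Map (N, 2) points through a 3x3 homography."""
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]


def _normalize(pts):
    """Hartley normalization: zero mean, average distance sqrt(2)."""
    mean = pts.mean(axis=0)
    scale = np.sqrt(2.0) / np.mean(np.linalg.norm(pts - mean, axis=1))
    T = np.array(
        [[scale, 0.0, -scale * mean[0]],
         [0.0, scale, -scale * mean[1]],
         [0.0, 0.0, 1.0]]
    )
    return apply_homography(T, pts), T


def fit_homography_dlt(src, dst):
    """Least-squares homography from point pairs via normalized DLT."""
    s, T_src = _normalize(src)
    d, T_dst = _normalize(dst)
    rows = []
    for (x, y), (u, v) in zip(s, d):
        rows.append([-x, -y, -1.0, 0.0, 0.0, 0.0, u * x, u * y, u])
        rows.append([0.0, 0.0, 0.0, -x, -y, -1.0, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    H_norm = vt[-1].reshape(3, 3)
    H = np.linalg.inv(T_dst) @ H_norm @ T_src
    return H / H[2, 2]


# Hypothetical ground-truth homography used only to synthesize grid flow.
H_true = np.array(
    [[1.02, 0.03, 5.0],
     [-0.02, 0.98, -3.0],
     [1e-5, -2e-5, 1.0]]
)

grid = make_grid(448, 448, step=64)           # 7 x 7 = 49 grid points
flow = apply_homography(H_true, grid) - grid  # stand-in for predicted grid flow
H_est = fit_homography_dlt(grid, grid + flow)
max_abs_error = np.max(np.abs(H_est - H_true))
```

With noise-free flow, the 49 grid points determine the eight degrees of freedom of the homography many times over, which is the redundancy a grid-based strategy exploits. With real, noisy predictions a robust estimator such as RANSAC would typically wrap the fit.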