Match me if you can: Semi-supervised semantic correspondence learning with unpaired images

Jiwon Kim, Byeongho Heo, Sangdoo Yun, Seungryong Kim, Dongyoon Han

Published: 09 Dec 2024, Last Modified: 13 May 2026ACCV 2024EveryoneCC BY 4.0

Abstract: Semantic correspondence methods have advanced to obtaining high-quality correspondences employing complicated networks, aiming to maximize the model capacity. However, despite the performance improvements, they may remain constrained by the scarcity of training keypoint pairs, a consequence of the limited training images and the sparsity of keypoints. This paper builds on the hypothesis that there is an inherent data-hungry matter in learning semantic correspondences and uncovers the models can be more trained by employing densified training pairs. We demonstrate a simple machine annotator reliably enriches paired key points via machine supervision, requiring neither extra labeled key points nor trainable modules from unlabeled images. Consequently, our models surpass current state-of-the-art models on semantic correspondence learning benchmarks like SPair-71k, PF-PASCAL, and PF-WILLOW and enjoy further robustness on corruption benchmarks.