NAC: Mitigating Noisy Correspondence in Cross-Modal Matching Via Neighbor Auxiliary Corrector

Published: 01 Jan 2024, Last Modified: 18 May 2025 | ICASSP 2024 | CC BY-SA 4.0
Abstract: Noisy correspondence in cross-modal matching significantly degrades the performance of existing matching methods. In this paper, we introduce a robust framework named Neighbor Auxiliary Corrector (NAC), which alleviates noise by utilizing neighbors that are indicative of similar textual targets. NAC is inspired by the observation that similar texts tend to correspond to similar images. Leveraging the zero-shot capabilities of Pre-trained Language Models (PLMs), we identify the top-k nearest neighbors for each positive image-text pair. The side information provided by these neighbors is then used for both sample verification and sample rectification. Extensive experiments on benchmark datasets demonstrate that our framework significantly boosts performance and is more robust to various levels of noisy correspondence.
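The neighbor idea in the abstract can be illustrated with a minimal sketch. It assumes text and image embeddings have already been computed by some pre-trained encoder (the abstract does not say which one) and that row i of each array belongs to the same image-text pair. The function names `top_k_text_neighbors` and `neighbor_agreement`, and the mean-cosine agreement score itself, are hypothetical choices made for illustration; the abstract does not specify how NAC performs verification or rectification, so this should not be read as the authors' method.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    x = np.asarray(x, dtype=float)
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def top_k_text_neighbors(text_emb, k):
    """Return an (n, k) array holding, for each text, the indices of its
    k most similar other texts by cosine similarity."""
    t = l2_normalize(text_emb)
    sim = t @ t.T
    np.fill_diagonal(sim, -np.inf)  # a text is not its own neighbor
    return np.argsort(-sim, axis=1)[:, :k]


def neighbor_agreement(image_emb, neighbors):
    """Hypothetical verification score (an assumption, not from the paper):
    mean cosine similarity between each pair's image and the images paired
    with its text neighbors. If similar texts correspond to similar images,
    a low score hints that the pair may be mismatched."""
    v = l2_normalize(image_emb)
    neighbor_images = v[neighbors]  # shape (n, k, d)
    return np.einsum("nd,nkd->nk", v, neighbor_images).mean(axis=1)
```

Under these assumptions, pairs with the lowest agreement scores would be the candidates for verification. How a flagged pair is then rectified is left open here, since the abstract gives no details.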