Enhancing Unsupervised Visible-Infrared Person Re-Identification with Bidirectional-Consistency Gradual Matching

Xiao Teng; Xingyu Shen; Kele Xu; Long Lan

Published: 20 Jul 2024, Last Modified: 21 Jul 2024, Venue: MM2024 Poster, License: CC BY 4.0

Abstract: Unsupervised visible-infrared person re-identification (USL-VI-ReID) is of great research and practical significance yet remains challenging due to significant modality discrepancy and lack of annotations. Many existing approaches utilize variants of bipartite graph global matching algorithms to address this issue, aiming to establish cross-modality correspondences. However, these methods may encounter mismatches due to significant modality gaps and limited model representation. To mitigate this, we propose a simple yet effective framework for USL-VI-ReID, which gradually establishes associations between different modalities. To measure the confidence that samples from different modalities belong to the same identity, we introduce a bidirectional-consistency criterion, which not only considers direct relationships between samples from different modalities but also incorporates potential hard negative samples from the same modality. Additionally, we propose a cross-modality correlation preserving module to enhance the semantic representation of the model by maintaining consistency in correlations across modalities. Extensive experiments conducted on the public SYSU-MM01 and RegDB datasets demonstrate the superiority of our method over existing USL-VI-ReID approaches across various settings, despite its simplicity. Our code will be released.
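The idea of accepting a cross-modality pair only when both directions agree can be illustrated with a generic mutual-nearest-neighbor check. The sketch below is not the criterion from the paper: it does not model same-modality hard negatives, and the function name `mutual_nearest_pairs` and the toy similarity matrix are assumptions made purely for illustration. Rows stand for visible-modality items and columns for infrared-modality items.

```python
def mutual_nearest_pairs(sim):
    """Return (i, j) pairs where visible item i and infrared item j
    each pick the other as their most similar cross-modality partner.

    Hypothetical helper for illustration; `sim[i][j]` is assumed to be a
    similarity score between visible item i and infrared item j.
    """
    if not sim or not sim[0]:
        return []
    n_vis, n_ir = len(sim), len(sim[0])
    # Best infrared match for every visible item (row-wise argmax).
    best_ir = [max(range(n_ir), key=lambda j: sim[i][j]) for i in range(n_vis)]
    # Best visible match for every infrared item (column-wise argmax).
    best_vis = [max(range(n_vis), key=lambda i: sim[i][j]) for j in range(n_ir)]
    # Keep only the pairs that agree in both directions.
    return [(i, j) for i, j in enumerate(best_ir) if best_vis[j] == i]


# Toy similarity matrix (made-up values): 3 visible items x 3 infrared items.
sim = [
    [0.9, 0.2, 0.1],
    [0.3, 0.4, 0.8],
    [0.7, 0.3, 0.2],
]
pairs = mutual_nearest_pairs(sim)
print(pairs)
```

In this toy case visible item 2 prefers infrared item 0, but infrared item 0 prefers visible item 0, so that pair is rejected. Under a gradual scheme, such unmatched items would simply be left for a later round once the features have improved.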

Primary Subject Area: [Experience] Multimedia Applications

Secondary Subject Area: [Engagement] Multimedia Search and Recommendation, [Content] Media Interpretation

Relevance To Conference: The task of unsupervised visible-infrared person re-identification (USL-VI-ReID) falls within the domain of multimedia content understanding and multi-modal processing, where the objective is to identify and match individuals across both visible and infrared modalities. Recently, there has been a surge in research efforts related to this task, with many notable works published in ACM MM, including but not limited to ADCA, CCLNet, DOTLA and MBCCM as referenced in our work. Compared with these works, we propose a novel method for USL-VI-ReID, which can achieve better performance despite its simplicity.

Supplementary Material: zip

Submission Number: 967
