Inter-Modality Similarity Learning for Unsupervised Multi-Modality Person Re-Identification

Published: 01 Jan 2024, Last Modified: 13 May 2025. IEEE Trans. Circuits Syst. Video Technol., 2024. License: CC BY-SA 4.0.
Abstract: RGB (visible), near-infrared (NI), and thermal infrared (TI) imaging modalities are commonly combined for round-the-clock surveillance. We introduce a novel unsupervised multi-modality person re-identification (MM-ReID) task, which, given an individual’s image in any one modality, seeks to identify matches in the other two modalities. Compared to prior MM-ReID problem formulations, unsupervised MM-ReID significantly reduces labeling cost and imaging constraints. To address the unsupervised MM-ReID task, we propose a novel inter-modality similarity learning (IMSL) framework consisting of four synergistic, interconnected modules: modality mean clustering (MMC), multi-modality reliability estimation (MMRE), shape-based mutual reinforcement (SMR), and modality-aware invariant learning (MIL). MMC iterates with SMR and MIL in a mutually beneficial manner to provide pseudo-labels that are robust to the modality gap. MMRE normalizes sample weights, mitigating the impact of noisy labels in the multi-modality setting. SMR emphasizes shape information to implicitly enhance the model’s robustness to the modality gap and is additionally guided by the pseudo-labels provided by MMC to attend to identity-related details. MIL explicitly encourages learning of modality-invariant and identity-related features via contrastive feedback to the MMC module. Extensive experimental results on multi-modality and cross-modality datasets demonstrate that IMSL provides substantial performance gains over existing methods. Code is available at https://github.com/zqpang/IMSL.
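Below is a minimal sketch of one plausible reading of the MMC step, based only on the abstract: features of the same image in each modality are averaged before clustering, so the resulting pseudo-labels are less sensitive to the modality gap. The function name, the choice of DBSCAN, and all parameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def modality_mean_clustering(feat_rgb, feat_ni, feat_ti, eps=0.5, min_samples=4):
    """Cluster modality-averaged features to obtain pseudo-labels (sketch).

    feat_rgb, feat_ni, feat_ti: (N, D) features of the same N images in the
    RGB, NI, and TI modalities (hypothetical input layout).
    Returns one pseudo-label per image; -1 marks DBSCAN outliers.
    """
    # Average the three modality-specific features so the cluster
    # assignment is driven by identity cues rather than modality style.
    mean_feat = (feat_rgb + feat_ni + feat_ti) / 3.0
    mean_feat /= np.linalg.norm(mean_feat, axis=1, keepdims=True) + 1e-12

    # Density-based clustering on cosine distance yields pseudo-identities.
    labels = DBSCAN(eps=eps, min_samples=min_samples,
                    metric="cosine").fit_predict(mean_feat)
    return labels

# Toy usage: 8 images, 16-D random features per modality.
rng = np.random.default_rng(0)
f = lambda: rng.normal(size=(8, 16))
print(modality_mean_clustering(f(), f(), f()))
```

In the full framework these pseudo-labels would then supervise the SMR and MIL modules, whose updated features feed back into the next clustering round.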