Enhancing Visible-Infrared Person Re-Identification With Modality- and Instance-Aware Adaptation Learning
Abstract: Visible-Infrared Person Re-identification (VI ReID) aims to match pedestrian images captured under visible and infrared illumination. A crucial challenge in this task is mitigating the impact of modality divergence so that the VI ReID model can learn cross-modality correspondence. To address this challenge, existing methods primarily focus on closing the information gap between modalities, either by extracting modality-invariant information or by supplementing inputs with information specific to the other modality. However, these methods may overly focus on bridging the information gap, a challenging problem in itself, which can overshadow the inherent complexities of cross-modality ReID. Based on this insight, we propose a straightforward yet effective strategy that gives the VI ReID model sufficient flexibility to adapt to diverse modality inputs and thus perform cross-modality ReID effectively. Specifically, we introduce a Modality-aware and Instance-aware Visual Prompts (MIP) network, which builds on a transformer architecture with customized visual prompts. In MIP, a set of modality-aware prompts enables the model to dynamically adapt to inputs from different modalities and extract identity-relevant information, thereby alleviating the interference of modality divergence. In addition, we propose instance-aware prompts, which guide the model to adapt to individual pedestrians and capture discriminative clues for accurate identification. Extensive experiments on four mainstream VI ReID datasets validate the effectiveness of the designed modules, and the proposed MIP network outperforms most current state-of-the-art methods.
External IDs: dblp:journals/tcsv/WuJLWWW25
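The abstract describes visual prompts attached to a transformer backbone, but no implementation details are given here. The following PyTorch sketch only illustrates the general idea of prepending learnable modality-aware prompts and instance-conditioned prompts to a ViT-style token sequence; the class name, prompt sizes, and the instance-prompt generator are assumptions for illustration, not the paper's actual MIP code.

```python
import torch
import torch.nn as nn

class ModalityAwarePromptedViT(nn.Module):
    """Illustrative sketch of modality- and instance-aware visual prompting.
    All names and dimensions are assumptions, not the paper's implementation."""

    def __init__(self, embed_dim=768, num_prompts=4, depth=12, num_heads=12):
        super().__init__()
        # One learnable prompt set per modality (index 0 = visible, 1 = infrared).
        self.modality_prompts = nn.Parameter(torch.zeros(2, num_prompts, embed_dim))
        nn.init.trunc_normal_(self.modality_prompts, std=0.02)
        # Lightweight generator producing instance-aware prompts from a
        # class-token summary of the pedestrian image (an assumed design).
        self.instance_prompt_gen = nn.Linear(embed_dim, num_prompts * embed_dim)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=depth)
        self.num_prompts = num_prompts

    def forward(self, patch_tokens, cls_token, modality_idx):
        # patch_tokens: (B, N, D) patch embeddings; cls_token: (B, 1, D)
        # modality_idx: (B,) long tensor, 0 for visible and 1 for infrared images.
        B, _, D = patch_tokens.shape
        mod_prompts = self.modality_prompts[modality_idx]          # (B, P, D)
        inst_prompts = self.instance_prompt_gen(cls_token.squeeze(1))
        inst_prompts = inst_prompts.view(B, self.num_prompts, D)   # (B, P, D)
        # Prepend prompts so self-attention can condition every patch token
        # on modality- and instance-specific context.
        tokens = torch.cat([cls_token, mod_prompts, inst_prompts, patch_tokens], dim=1)
        out = self.encoder(tokens)
        return out[:, 0]  # class-token feature used for re-identification
```

In this sketch, modality prompts are selected by a per-image modality label, while instance prompts are generated from the image's class token; the actual MIP network may construct and inject its prompts differently.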