Intermediary-Guided Bidirectional Spatial-Temporal Aggregation Network for Video-Based Visible-Infrared Person Re-Identification

Published: 01 Jan 2023, Last Modified: 10 Nov 2023 | IEEE Trans. Circuits Syst. Video Technol. 2023 | Readers: Everyone
Abstract: This work focuses on the task of Video-based Visible-Infrared Person Re-Identification, a promising technique for achieving 24-hour surveillance systems. Two main issues in this field are mitigating the modality discrepancy and mining spatial–temporal information. In this work, we propose a novel method, named Intermediary-guided Bidirectional spatial–temporal Aggregation Network (IBAN), to address both issues at once. Specifically, IBAN is designed to learn modality-irrelevant features by using the anaglyph data of pedestrian images as the intermediary. Furthermore, a bidirectional spatial–temporal aggregation module is introduced to exploit the spatial–temporal information of video data while mitigating the impact of noisy image frames. Finally, we design an Easy-sample-based loss to guide the final embedding space and further improve the model's generalization performance. Extensive experiments on Video-based Visible-Infrared benchmarks show that IBAN achieves promising results and outperforms the state-of-the-art ReID methods by a large margin, improving rank-1/mAP by 1.29%/3.46% in the Infrared-to-Visible setting and by 5.04%/3.27% in the Visible-to-Infrared setting. The source code of the proposed method will be released at https://github.com/lhf12278/IBAN.