Person Re-Identification With Arbitrary Modalities: A Multi-Modal Dataset and a Unified Framework

Published: 2025 · Last Modified: 20 Jan 2026 · IEEE Trans. Inf. Forensics Secur. 2025 · CC BY-SA 4.0
Abstract: This paper proposes a unified visual person re-identification (re-id) framework capable of handling various re-id tasks, including modal-fusion re-id, cross-modal re-id, and single-modal re-id, to accommodate diverse modal scenarios. We begin by constructing a Multi-modal Person Re-identification (MPR) dataset comprising RGB, infrared (IR), and depth modalities. We then establish the unified re-id framework by integrating an Adaptive Modality Aggregation Module (AMAM) and Multi-modal Auto-aligned Learning (MAL). The former autonomously aggregates distinct modalities by thoroughly exploring their relationships. It not only benefits modal-fusion re-id by strengthening the modal-fusion representations, but also enhances cross-modal re-id by performing modal consistency learning on the modal-fusion features to narrow modal gaps. The latter automatically aligns multiple modalities through contrastive learning constraints to reduce modal gaps across multiple cross-modal re-id tasks. Together, these two modules respectively balance tasks of different types and different tasks of the same type, enabling the framework to handle a wider range of re-id tasks under diverse modal scenarios. Moreover, we evaluate state-of-the-art (SOTA) multi-modal methods under a rich set of testing settings constructed on the MPR dataset. The experiments demonstrate that the proposed unified method, which needs to be trained only once, outperforms existing methods that require multiple modality-specific training processes, and that it copes with more scenarios. Extensive ablation studies investigate the effects of the proposed modules on all re-id tasks. Our dataset and code will be publicly available soon: https://github.com/hfutwujingjing/A-Multi-Modal-Dataset-and-A-Unified-Framework
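To make the contrastive alignment idea behind MAL concrete, the following is a minimal sketch (not the authors' implementation) of an InfoNCE-style cross-modal alignment loss between two modality-specific embedding batches; the function name, the symmetric formulation, and the temperature hyper-parameter are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a cross-modal contrastive alignment loss.
# Assumption: row i of feat_a and feat_b comes from the same identity,
# observed in two different modalities (e.g. RGB and IR).
import torch
import torch.nn.functional as F

def cross_modal_contrastive_loss(feat_a, feat_b, temperature=0.07):
    """InfoNCE-style loss that pulls same-identity embeddings from two
    modalities together and pushes apart embeddings of other identities.

    feat_a, feat_b: (N, D) feature tensors; temperature is a hypothetical
    hyper-parameter, not a value reported by the paper.
    """
    feat_a = F.normalize(feat_a, dim=1)
    feat_b = F.normalize(feat_b, dim=1)
    logits = feat_a @ feat_b.t() / temperature            # (N, N) similarities
    targets = torch.arange(feat_a.size(0), device=feat_a.device)
    # Symmetric objective: align modality A to B and B to A.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```

In the paper's setting such a constraint would be applied across every modality pair (RGB-IR, RGB-depth, IR-depth) so that all cross-modal re-id tasks benefit from a shared, aligned embedding space.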