HOH-Net: High-Order Hierarchical Middle-Feature Learning Network for Visible-Infrared Person Re-Identification
Abstract: Visible-infrared person re-identification (VI-ReID) is a cross-modality retrieval task that aims to match images of the same person across visible (VIS) and infrared (IR) modalities. Existing VI-ReID methods ignore the high-order structure information of features and struggle to learn a reliable common feature space because of the modality discrepancy between VIS and IR images. To alleviate these issues, we propose a novel high-order hierarchical middle-feature learning network (HOH-Net) for VI-ReID. We introduce a high-order structure learning (HSL) module that explores the high-order relationships among short- and long-range feature nodes, significantly mitigating model collapse and producing discriminative features. We further develop a fine-coarse graph attention alignment (FCGA) module, which efficiently aligns multi-modality feature nodes from both node-level and region-level perspectives, ensuring reliable middle-feature representations. Moreover, we exploit a hierarchical middle-feature agent learning (HMAL) loss to hierarchically reduce the modality discrepancy at each stage of the network by using agents of the middle features; the HMAL loss also exchanges detailed and semantic information between low- and high-stage networks. Finally, we introduce a modality-range identity-center contrastive (MRIC) loss to minimize the distances between VIS, IR, and middle features. Extensive experiments demonstrate that the proposed HOH-Net achieves state-of-the-art performance on both image-based and video-based VI-ReID datasets. The code is available at: https://github.com/Jaulaucoeng/HOS-Net
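To make the MRIC objective concrete, below is a minimal PyTorch sketch of a loss in this spirit, based only on the abstract's description of pulling together VIS, IR, and middle features per identity. All function names, the center-averaging scheme, and the squared-distance formulation are assumptions for illustration, not the authors' released implementation.

```python
# Hypothetical sketch of a modality-range identity-center contrastive (MRIC)
# style loss: compute per-identity feature centers for the VIS, IR, and
# middle branches, then minimize the distances between corresponding centers.
# Assumes identity-paired batches; the exact published formulation may differ.
import torch
import torch.nn.functional as F


def identity_centers(features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Average the feature vectors belonging to each identity in the batch."""
    ids = labels.unique()
    return torch.stack([features[labels == i].mean(dim=0) for i in ids])


def mric_loss(vis: torch.Tensor, ir: torch.Tensor, mid: torch.Tensor,
              labels: torch.Tensor) -> torch.Tensor:
    """Pull VIS and IR identity centers toward the middle-feature centers
    and toward each other, shrinking the cross-modality gap."""
    c_vis = identity_centers(vis, labels)
    c_ir = identity_centers(ir, labels)
    c_mid = identity_centers(mid, labels)
    return (F.mse_loss(c_vis, c_mid)
            + F.mse_loss(c_ir, c_mid)
            + F.mse_loss(c_vis, c_ir))


# Toy usage: 8 samples per modality, 4 identities, 256-dim features.
if __name__ == "__main__":
    labels = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
    vis, ir, mid = (torch.randn(8, 256) for _ in range(3))
    print(mric_loss(vis, ir, mid, labels))
```

In this sketch the middle-feature centers act as a shared anchor between the two modalities, which matches the abstract's framing of middle features as a bridge across the VIS-IR gap.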