Image-Centered Pseudo Label Generation for Weakly Supervised Text-Based Person Re-Identification

Published: 01 Jan 2024, Last Modified: 15 May 2025. Venue: PRCV (12) 2024. License: CC BY-SA 4.0
Abstract: Weakly supervised text-based person re-identification aims to identify a target person from textual descriptions when identity annotations are not available during training. Previous methods cluster images and texts simultaneously to generate pseudo identity labels. However, we observe that the number of text clusters is significantly smaller than the number of true identities, while the number of image clusters is closer to the actual number of identities. This mismatch leads to uncertain pseudo identity labels. To address this issue, we propose a new approach called Image-Centered Pseudo Label Generation (ICPG) for weakly supervised text-based person re-identification, which generates pseudo labels for both images and texts directly from the image clustering results. First, we introduce a cross-modal distribution matching loss that minimizes the KL divergence between the image-text similarity distribution and the normalized pseudo-label matching distribution. Second, to enhance cross-modal associations, we propose a cross-modal hard sample mining method that explores challenging cross-modal examples. Experimental results demonstrate the effectiveness of the proposed methods. Compared to the state-of-the-art method, our approach improves rank-1 accuracy by 3.6%, 2.4% and 3.0% on three datasets, respectively.
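To make the cross-modal distribution matching loss more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that each text inherits the cluster ID of its paired image, that similarities are cosine similarities scaled by a temperature, and that the KL divergence is taken from the predicted distribution to the pseudo-label distribution in both retrieval directions. The function name `distribution_matching_loss`, the `temperature` and `eps` values, and the KL direction are all illustrative assumptions that the abstract does not specify.

```python
import numpy as np

def softmax(x, axis=1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def distribution_matching_loss(img_emb, txt_emb, pseudo_labels,
                               temperature=0.02, eps=1e-8):
    """Hypothetical sketch of a KL-based cross-modal matching loss.

    img_emb, txt_emb: (N, D) arrays; row i of each forms an image-text pair.
    pseudo_labels:    (N,) image cluster IDs, shared by the paired texts
                      (assumption: texts take their image's cluster ID).
    """
    # Cosine similarity between every image and every text.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    sim = img @ txt.T / temperature

    # Normalized pseudo-label matching distribution: uniform over the
    # entries that share the same pseudo label, zero elsewhere.
    match = (pseudo_labels[:, None] == pseudo_labels[None, :]).astype(float)
    q = match / match.sum(axis=1, keepdims=True)

    def kl(p, q):
        # Mean over rows of KL(p || q), with eps for numerical safety.
        return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=1).mean()

    p_i2t = softmax(sim, axis=1)    # image-to-text similarity distribution
    p_t2i = softmax(sim.T, axis=1)  # text-to-image similarity distribution
    return kl(p_i2t, q) + kl(p_t2i, q)

# Toy usage: two pseudo identities, two image-text pairs each.
labels = np.array([0, 0, 1, 1])
images = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
texts_aligned = images.copy()
texts_swapped = images[::-1].copy()
loss_aligned = distribution_matching_loss(images, texts_aligned, labels)
loss_swapped = distribution_matching_loss(images, texts_swapped, labels)
```

In this toy setup the loss is close to zero when text embeddings agree with the image clusters and much larger when they are swapped, which is the behavior a distribution matching objective of this kind is meant to encourage.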