Spatiotemporal Consensus with Scene Prior for Unsupervised Domain Adaptive Person Search

Yimin Jiang; Huibing Wang; Jinjia Peng

Published: 18 Sept 2025, Last Modified: 29 Oct 2025

Venue: NeurIPS 2025 poster

License: CC BY 4.0

Keywords: Person Search

Abstract: Person Search aims to locate query persons in gallery scene images, but suffers severe performance degradation under domain shift. Unsupervised domain adaptation transfers knowledge from the labeled source domain to the unlabeled target domain and iteratively rectifies the pseudo-labels. However, the pseudo-labels are inevitably contaminated by the source-biased model, which misleads the training process. This, in turn, reduces the quality of the pseudo-labels themselves and ultimately degrades search performance. In this paper, we propose a Spatiotemporal Consensus with Scene Prior (STCSP) framework that effectively eliminates the interference of noise on pseudo-labels, establishes positive feedback, and thus gradually bridges the domain gap. First, STCSP uses a Spatiotemporal Consensus pipeline to keep noise from being mixed into the pseudo-labels. Second, leveraging the scene prior, STCSP employs our Iterative Bilateral Extremum Matching method to prevent some incorrect pseudo-labels from arising. Third, we propose a Scene Prior Contrastive Learning module, which encourages the model to acquire scene prior knowledge directly from the target domain, thereby mitigating the generation of noise. By suppressing noise contamination, avoiding noise occurrence, and mitigating noise generation, our framework achieves state-of-the-art performance on two benchmark datasets: 50.2% mAP on PRW and 87.0% mAP on CUHK-SYSU.
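The abstract does not spell out how Iterative Bilateral Extremum Matching works, so the following is only an illustrative sketch of one plausible reading, not the authors' algorithm. It assumes the scene prior means that a person appears at most once per scene image, so detections from two scenes can be paired at most one-to-one, and it repeatedly accepts pairs that are each other's best match (mutual nearest neighbors). The function name `iterative_mutual_matching`, the similarity matrix `sim`, and the threshold `min_sim` are all hypothetical.

```python
import numpy as np

def iterative_mutual_matching(sim, min_sim=0.5):
    """Pair detections of scene A (rows) with detections of scene B (columns).

    Assumption (not from the paper): each person appears at most once per
    scene, so every row and every column may be used in at most one pair.
    """
    sim = np.asarray(sim, dtype=float).copy()
    matches = []
    while True:
        row_best = np.argmax(sim, axis=1)  # best column for every row
        col_best = np.argmax(sim, axis=0)  # best row for every column
        new_pairs = []
        for i, j in enumerate(row_best):
            score = sim[i, j]
            # Accept only mutual best matches that are still unmasked
            # and similar enough.
            if col_best[j] == i and np.isfinite(score) and score >= min_sim:
                new_pairs.append((i, int(j)))
        if not new_pairs:
            break
        for i, j in new_pairs:
            # Mask the matched row and column so they cannot be reused.
            sim[i, :] = -np.inf
            sim[:, j] = -np.inf
        matches.extend(new_pairs)
    return sorted(matches)

# Two detections in scene A, three in scene B (made-up similarities).
example = [[0.90, 0.80, 0.10],
           [0.85, 0.30, 0.20]]
pairs = iterative_mutual_matching(example)
```

In this toy example, both rows prefer column 0, but only row 0 is also column 0's best match, so a plain nearest-neighbor assignment would have given two detections in the same scene the same identity. Lowering `min_sim` lets the second row be matched in a later iteration, once row 0 and column 0 have been masked.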

Primary Area: General machine learning (supervised, unsupervised, online, active, etc.)

Submission Number: 21201
