Prompting Continual Person Search

Published: 20 Jul 2024, Last Modified: 21 Jul 2024, MM2024 Poster, Readers: Everyone, CC BY 4.0
Abstract: Person search has advanced considerably in recent years, driven by its practical value and challenging goals. Despite this progress, existing person search models still lack the ability to learn continually from growing real-world data and to adaptively process input from different domains. To this end, this work introduces the continual person search task, in which a model learns sequentially on multiple domains and then performs person search on all seen domains. This requires balancing the stability and plasticity of the model so that it can keep learning new knowledge without catastrophic forgetting. To achieve this, we propose a Prompt-based Continual Person Search (PoPS) model. First, we design a compositional person search transformer to construct an effective pre-trained transformer without exhaustive pre-training from scratch on large-scale person search data. This serves as the foundation for prompt-based continual learning. On top of that, we design a domain incremental prompt pool with a diverse attribute matching module. For each domain, we independently learn a set of prompts to encode the domain-oriented knowledge. Meanwhile, we jointly learn a group of diverse attribute projections and prototype embeddings to capture discriminative domain attributes. By matching an input image with the learned attributes across domains, the learned prompts can be properly selected for model inference. Extensive experiments are conducted to validate the proposed method for continual person search. The source code will be made available upon publication.
Primary Subject Area: [Experience] Multimedia Applications
Secondary Subject Area: [Engagement] Multimedia Search and Recommendation
Relevance To Conference: The person search task aims to localize and recognize a target person in uncropped scene images captured by different cameras. It is practical for multimedia applications such as video surveillance and trajectory tracking. In this paper, we propose, for the first time, to enable continual person search, which continually learns from growing real-world data of different domains and adaptively processes images from learned domains. This extends existing person search methods so that they can keep adapting to different domains without catastrophically forgetting previously learned domain knowledge, facilitating more robust and effective person search models in multimedia applications.
Supplementary Material: zip
Submission Number: 368
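
Illustrative sketch (not the authors' implementation): the abstract describes selecting domain-specific prompts by matching an input image with learned attributes across domains. A minimal way to picture that step, assuming shared attribute projections, one prototype per attribute per domain, and cosine similarity as the matching score, is shown below. All names (`select_domain`, `projections`, `prototypes`, `prompt_pool`) and the toy values are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def project(feature, weights):
    """Apply a linear projection given as a list of weight rows."""
    return [sum(w * x for w, x in zip(row, feature)) for row in weights]


def select_domain(feature, projections, prototypes):
    """Pick the domain whose attribute prototypes best match the feature.

    projections: list of K projection matrices (assumed shared across domains).
    prototypes:  dict mapping domain name -> list of K prototype vectors.
    Returns (best_domain, scores), where scores holds the mean cosine
    similarity for every domain.
    """
    attributes = [project(feature, w) for w in projections]
    scores = {}
    for domain, protos in prototypes.items():
        sims = [cosine(a, p) for a, p in zip(attributes, protos)]
        scores[domain] = sum(sims) / len(sims)
    best_domain = max(scores, key=scores.get)
    return best_domain, scores


# Toy setup: two attribute projections and two previously seen domains.
projections = [
    [[1.0, 0.0], [0.0, 1.0]],  # identity projection
    [[0.0, 1.0], [1.0, 0.0]],  # projection that swaps the two coordinates
]
prototypes = {
    "domain_a": [[1.0, 0.0], [0.0, 1.0]],
    "domain_b": [[0.0, 1.0], [1.0, 0.0]],
}
# Placeholder prompts; in a real model these would be learned token embeddings.
prompt_pool = {
    "domain_a": ["prompt_a_0", "prompt_a_1"],
    "domain_b": ["prompt_b_0", "prompt_b_1"],
}

best, scores = select_domain([0.9, 0.1], projections, prototypes)
selected_prompts = prompt_pool[best]
```

Because prompts for each domain are kept separate in this sketch, adding a new domain only adds entries to `prototypes` and `prompt_pool`; existing entries are left untouched, which is the intuition behind avoiding forgetting. How PoPS actually learns and scores attributes may differ from this.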