More is better: Multi-source Dynamic Parsing Attention for Occluded Person Re-identification

Published: 01 Jan 2022, Last Modified: 12 May 2023. ACM Multimedia 2022.
Abstract: Occluded person re-identification (re-ID) has been a long-standing challenge in surveillance systems. Most existing methods tackle this challenge by aligning the spatial features of human parts according to external semantic cues inferred from off-the-shelf semantic models (e.g., human parsing and pose estimation). However, there is a significant domain gap between the images in re-ID datasets and the images used to train these semantic models, which inevitably makes the semantic cues unreliable and deteriorates re-ID performance. Multi-source knowledge ensembling has been proven effective for domain adaptation. Inspired by this, we propose a multi-source dynamic parsing attention (MSDPA) mechanism that leverages knowledge learned from different source datasets to generate reliable semantic cues and dynamically integrates and adapts them in a self-supervised manner via an attention mechanism. Specifically, we first design a parsing embedding module (PEM) that integrates and embeds the multi-source semantic cues into the patch tokens through a voting procedure. To further exploit correlations among body parts with similar semantics, we design a dynamic parsing attention block (DPAB) that guides patch-sequence aggregation with prior attentions dynamically generated from the human parsing results. Extensive experiments on occluded, partial, and holistic re-ID datasets show that MSDPA consistently achieves superior re-ID performance and outperforms state-of-the-art methods by large margins on the occluded datasets.
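The two mechanisms named in the abstract can be illustrated with a toy sketch. Note this is our own minimal reconstruction, not the paper's implementation: the function names, the list-of-grids input format, and the binary same-part attention prior are all assumptions made for illustration; the paper's PEM and DPAB operate on learned embeddings and dynamically generated attentions rather than a hard 0/1 prior.

```python
from collections import Counter

def vote_patch_labels(parsing_maps, patch_size):
    # PEM-style voting (sketch): parsing_maps is a list of K HxW grids
    # (lists of lists) of integer part labels, one grid per source
    # parsing model. Each patch takes the majority label across all
    # sources and all pixels it covers.
    H, W = len(parsing_maps[0]), len(parsing_maps[0][0])
    ph, pw = H // patch_size, W // patch_size
    voted = [[0] * pw for _ in range(ph)]
    for i in range(ph):
        for j in range(pw):
            votes = Counter(
                m[y][x]
                for m in parsing_maps
                for y in range(i * patch_size, (i + 1) * patch_size)
                for x in range(j * patch_size, (j + 1) * patch_size)
            )
            voted[i][j] = votes.most_common(1)[0][0]
    return voted

def parsing_attention_prior(voted):
    # DPAB-style prior (sketch): over the flattened patch sequence,
    # give weight 1.0 to pairs of patches sharing a voted part label
    # and 0.0 otherwise, so attention is steered toward semantically
    # similar body parts before occluded regions can interfere.
    flat = [label for row in voted for label in row]
    return [[1.0 if a == b else 0.0 for b in flat] for a in flat]
```

For example, with three source parsers that all label the left half of a 4x4 image as part 1 and the right half as part 2, `vote_patch_labels(..., 2)` yields a 2x2 grid `[[1, 2], [1, 2]]`, and the resulting prior links the two left-half patches (and the two right-half patches) to each other.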