Gradually Spatio-Temporal Feature Activation for Target Tracking

Published: 01 Jan 2024, Last Modified: 11 Apr 2025, Venue: ICASSP 2024, License: CC BY-SA 4.0
Abstract: Most existing transformer-based trackers use ViT [1] as the backbone to extract and fuse the feature tokens of the target template and the search region. Because both the target template and the search region contain background information, their tokens are prone to background interference during interaction, which degrades tracking performance. We propose GFATrack, a tracker that combines spatio-temporal information with prominent target features. It consists mainly of a Dynamic Template Refinement branch and a Search Feature Enhancement branch. The former activates the target features in the dynamic template and provides temporal information. The latter enhances the search features by aggregating spatial information from the initial template and temporal information from the dynamic template, enabling precise tracking. The proposed Feature Activation Module fuses refined features with reference features and highlights the features that are highly similar between them. We also propose a concise and effective dynamic-threshold update strategy that captures temporal context by updating the dynamic template from historical prediction results. Extensive experiments verify the effectiveness and state-of-the-art performance of the proposed method.
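The abstract does not spell out the dynamic-threshold update rule, so the following is only an illustrative sketch of the general idea of updating a dynamic template from historical prediction results. Everything in it is an assumption made for illustration: the class name `DynamicTemplateUpdater`, the use of a scalar confidence score per frame, the exponential-moving-average threshold, and the `momentum` and initial `threshold` values. It should not be read as GFATrack's actual strategy.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class DynamicTemplateUpdater:
    """Hypothetical confidence-gated template updater (not the paper's rule)."""

    momentum: float = 0.9          # assumed smoothing factor for the threshold
    threshold: float = 0.5         # assumed initial threshold
    template: Optional[Any] = None # current dynamic template (any crop/feature)

    def step(self, candidate: Any, score: float) -> bool:
        """Process one prediction; return True if the template was replaced."""
        # Replace the dynamic template only when the new prediction is more
        # confident than the threshold derived from earlier predictions.
        updated = score > self.threshold
        if updated:
            self.template = candidate
        # Assumption: the threshold follows an exponential moving average of
        # the observed scores, so it adapts to the sequence's history.
        self.threshold = (
            self.momentum * self.threshold + (1.0 - self.momentum) * score
        )
        return updated
```

In this sketch a tracker would call `step(crop, score)` once per frame, passing the predicted target crop and its confidence. A threshold that drifts with the score history makes the update stricter in easy sequences and more permissive in hard ones, which is one plausible reading of "dynamic threshold".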