Learning color prompt and position constraint for visual tracking

Xuedong He; Huiying Xu; Xinzhong Zhu; Hongbo Li; Xiao Huang; Yunliang Jiang

Published: 01 Jan 2025, Last Modified: 07 Aug 2025. Venue: Eng. Appl. Artif. Intell. 2025. License: CC BY-SA 4.0

Abstract: The current success of visual tracking owes much to powerful pre-trained backbone networks. Even when frozen and used merely as feature extractors, such networks can deliver substantial tracking performance. However, how to acquire target-aware features suited to visual tracking remains an active research topic for improving tracking robustness. Inspired by prompt learning, we propose a color prompt encoder to guide the acquisition of target-aware capability. Concretely, color histogram features, although a naive feature representation, provide complementary cues, so we use them to construct a color target probability that serves as a color prompt. The color prompt is then integrated into the unified tracking network to guide the generation of target-specific feature maps. Furthermore, Discriminative Correlation Filter (DCF)-based trackers with an online update module can effectively adapt to constantly changing objects, so it is imperative that only credible prediction samples are used to refine the tracking model online. Hence, we further devise a simple position offset constraint method based on target motion inertia to screen for more reliable prediction results. Extensive experiments demonstrate the validity of the color prompt encoder and the position offset constraint in the DCF tracking framework. Our trackers perform favorably against recent and far more sophisticated trackers on multiple public benchmarks. In particular, our proposed tracker achieves a robustness of 0.815 and an expected average overlap (EAO) of 0.305 on the Visual Object Tracking (VOT) 2020 dataset, improving on the baseline in robustness (+2.6 %) and EAO (+0.8 %).
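
The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the code assumes a common histogram-ratio estimate of the color target probability and a constant-velocity reading of "motion inertia". The function names, the `bins` parameter, and the `max_offset` threshold are all hypothetical.

```python
import numpy as np


def color_target_probability(image, target_mask, bins=16):
    """Per-pixel probability that a pixel's color belongs to the target.

    Assumption: probability = foreground count / (foreground + background
    count) for each quantized RGB color, a common histogram-ratio model.
    image: (H, W, 3) uint8 array; target_mask: (H, W) boolean array.
    """
    # Quantize each channel into `bins` levels and fold into one bin index.
    q = (image.astype(np.int64) * bins) // 256
    idx = (q[..., 0] * bins + q[..., 1]) * bins + q[..., 2]
    n = bins ** 3
    fg = np.bincount(idx[target_mask], minlength=n).astype(np.float64)
    bg = np.bincount(idx[~target_mask], minlength=n).astype(np.float64)
    total = fg + bg
    # Colors never observed get a neutral probability of 0.5.
    table = np.full(n, 0.5)
    seen = total > 0
    table[seen] = fg[seen] / total[seen]
    # Look up every pixel's color in the table to form the probability map.
    return table[idx]


def passes_position_constraint(history, candidate, max_offset):
    """Accept a predicted center only if it stays close to the inertial guess.

    Assumption: "motion inertia" is modeled as constant velocity, so the
    expected center is the last center plus the last displacement.
    history: sequence of at least two (x, y) centers; candidate: (x, y).
    """
    prev = np.asarray(history[-2], dtype=np.float64)
    last = np.asarray(history[-1], dtype=np.float64)
    expected = last + (last - prev)
    offset = np.linalg.norm(np.asarray(candidate, dtype=np.float64) - expected)
    return bool(offset <= max_offset)
```

In a tracker of this kind, the probability map would presumably be fed to the network as an additional prompt input, and a prediction that fails the offset check would be excluded from the online model update. Both of those uses are assumptions about how the pieces fit together.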
